Preface



This volume consists of papers selected from the presentations given at the Inter- 
national Workshop and Symposium on “Applications of Graph Transformation 
with Industrial Relevance” (AGTIVE 2003). The papers underwent up to two 
additional reviews. This volume contains the revised versions of these papers. 

AGTIVE 2003 was the second event of the Graph Transformation community. 
The aim of AGTIVE is to unite people from research and industry interested in 
the application of Graph Transformation to practical problems. The first work- 
shop took place at Kerkrade, The Netherlands. The proceedings appeared as vol. 
1779 of Springer-Verlag’s Lecture Notes in Computer Science series. This second
workshop, AGTIVE 2003, was held in historic Charlottesville, Virginia, USA. 

Graphs constitute well-known, well-understood, and frequently used means 
to depict networks of related items in different application domains. Various 
types of graph transformation approaches - also called graph grammars or graph 
rewriting systems - have been proposed to specify, recognize, inspect, modify, 
and display certain classes of graphs representing structures of different domains. 

Research activities based on Graph Transformations (GT for short) consti- 
tute a well-established scientific discipline within Computer Science. The inter- 
national GT research community is quite active and has organized international 
workshops and the conference ICGT 2002. The proceedings of these events, a 
three-volume handbook on GT, and books on specific approaches as well as large
application projects provide good documentation of research in the GT field
(see the list at the end of the proceedings). 

The intention of all these activities has been (1) to bring together the in- 
ternational community in a viable scientific discussion, (2) to integrate different 
approaches, and (3) to build a bridge between theory and practice. 

More specifically, the International Workshop and Symposium AGTIVE aims 
at demonstrating that GT approaches are mature enough to influence practice, 
even in industry. This ambitious goal is encouraged by the fact that the focus of 
GT research has changed within the last 15 years. Practical topics have gained 
considerable attention and usable GT implementations are available now.
Furthermore, AGTIVE is intended to deliver an up-to-date state-of-the-art report
of the applications of GT and, therefore, also of GT implementations and their use for
solving practical problems. 

The program committee of the International AGTIVE 2003 Workshop and 
Symposium consisted of the following persons: 

Jules Desharnais, Laval University, Quebec, Canada 

Hans-Joerg Kreowski, University of Bremen, Germany 

Fred (Buck) McMorris, Illinois Institute of Technology, Chicago, USA

Ugo Montanari, University of Pisa, Italy

Manfred Nagl, RWTH Aachen University, Germany (Co-chair)

Francesco Parisi-Presicce, Univ. of Rome, Italy 
and George Mason Univ., USA 

John L. Pfaltz, University of Virginia, Charlottesville, USA (Co-chair) 
Andy Schuerr, Technical University of Darmstadt, Germany
Gabriele Taentzer, Technical University of Berlin, Germany. 

The program of the workshop started with a tutorial on GT given by L. 
Baresi and R. Heckel (not given in the proceedings). The workshop contained 12 
sessions of presentations, two of them starting with the invited talks of H. Rising 
and G. Karsai, respectively. Two demo sessions gave a good survey on different 
practical GT systems on the one hand and the broad range of GT applications 
on the other. 

At the end of the workshop five participants (G. Taentzer, H. Vangheluwe, 
B. Westfechtel, M. Minas, A. Rensink) gave a personal summary of their impres- 
sions, each of them from a different perspective. In order to enliven the workshop 
there were two competitions, namely for the best paper and for the best demo 
presentation, which were won by C. Smith and aequo loco by M. Minas and A. 
Rensink, respectively. The proceedings contain most of these items. 

The workshop was attended by 47 participants from 12 countries, namely 
Belgium, Brazil, Canada, France, Germany, Italy, Poland, Spain, Sweden, The 
Netherlands, the UK, and the USA. The success of the workshop is based on the 
activeness of all participants contributing to presentations and discussions. Fur- 
thermore, it is due to the work done by referees and, especially, by the members 
of the program committee. 

A considerable part of the workshop’s success was also due to the familiar 
Southern State atmosphere we witnessed at Charlottesville. Omni Hotel, the 
workshop conference site, gave us complete support from excellent meals to any 
kind of technical equipment. On Wednesday afternoon, the main social event 
was a visit to the homes of Thomas Jefferson (Monticello) and James Monroe 
(Ash Lawn), followed by the workshop dinner. Jefferson was the 3rd, Monroe
the 5th president of the United States. Thomas Jefferson in particular, who was
also the founder of the University of Virginia and the author of the Declaration
of Independence, had a strong influence on the Charlottesville area.

A more comprehensive report about AGTIVE 2003, written by Dirk Janssens, 
was published in the “Bulletin of the European Association for Theoretical
Computer Science” and in the “Softwaretechnik-Trends” of the German Association
of Computer Science. 

The workshop was made possible by grants given by the following organi- 
zations: Deutsche Forschungsgemeinschaft (the German Research Foundation), 
the European Union Research Training Network SEGRAVIS, the United States
National Science Foundation, and the Society for Industrial and Applied
Mathematics. In particular, the donations have allowed researchers from abroad as
well as young scientists to come to Charlottesville by partially financing their 
travel expenses. Furthermore, the grants covered part of the organizational costs 
of the workshop. 

Last but not least, the editors would like to thank Peggy Reed, Scott Ruffner, 
and Bodo Kraft for their help in the organization of the workshop. 

March 2004 John L. Pfaltz 

Manfred Nagl 
Boris Boehlen

for Merging User Navigation Histories



Mario Michele Gala, Elisa Quintarelli, and Letizia Tanca 

Dipartimento di Elettronica e Informazione — Politecnico di Milano 
Piazza Leonardo da Vinci, 32 — 20133 Milano, Italy 
galam@tiscali.it
{quintare,tanca}@elet.polimi.it



Abstract. Web Mining is a promising research area which mainly stud- 
ies how to personalize the Web experience for users. In order to achieve 
this goal it is fundamental to analyze the user navigations to get relevant 
information about their behavior. In this work we consider a database
approach based on a graphical representation of both Web sites and user 
interactions. In particular, we will see how to obtain the graph summa- 
rizing a set of user interactions from the graphs of single interactions by 
adopting the graph transformation technique. 

Keywords: Semistructured Data, User Navigation History, Web Min- 
ing, Graph Transformation. 



1 Introduction 

In recent years the database research community has concentrated on the intro- 
duction of methods for representing and querying semistructured data. Roughly 
speaking, this term is used for data that have no absolute schema fixed in ad- 
vance, and whose structure may be irregular or incomplete [1]. A common ex- 
ample in which semistructured data arise is when data are stored in sources 
that do not impose a rigid structure, such as the World Wide Web, or when 
they are extracted from multiple heterogeneous sources. An increasing amount
of semistructured data is becoming available to users, and Web-enabled
applications therefore need to access, query and process heterogeneous or
semistructured information, flexibly dealing with variations in their
structure. More recently, interest in semistructured data has been further
increased by the success of XML (Extensible Markup Language) as a ubiquitous
standard for data representation and exchange [22].

Most available models for semistructured data are based on labeled graphs 
(see, for example, OEM [20], UnQL [5], and GraphLog [8]), because the graph
formalism flexibly supports variability in data structure. These models
organize data in graphs where nodes denote either objects or values, and edges 
represent relationships between them. 

In the context of semistructured data, proposals presented in the literature for 
representing temporal information also use labeled graphs [6, 12, 19]. Recently,
it has been recognized and emphasized that time is an important aspect to
consider in designing and modeling Web sites [3]: semistructured temporal data
models can provide a suitable infrastructure for the effective management of
time-varying documents on the Web.

When considering semistructured data, and Web sites in particular, it
is interesting to apply the classical notion of valid time (studied in the past years 
in the context of relational databases) to the representation of user browsing. 
With Jensen et al. [15], we regard valid time (VT) of a fact as the time when 
the fact is true in the modeled reality. 

In this work we represent a Web site by means of a semistructured, graph-based
temporal model called TGM [19]. By browsing through a document (for
example a hypermedia representation) each user chooses a particular path in the 
graph representing the document itself and in this way defines a personalized 
order between the visited objects. In this context, temporal queries, applied to an 
appropriate representation of the site, can be used to create site views depending 
on each user’s choices, in order to personalize data presentation [11].

Monitoring and analyzing how the Web is used is nowadays an active area of 
research in both the academic and commercial worlds. The Web Usage Mining 
research field studies patterns of behavior for Web users [21]. Personalizing the 
Web experience for users is a crucial task of many Web-based applications, for
example related to e-commerce or to e-services: in fact, providing dynamic and 
personalized recommendations based on their profile and not only on general 
usage behavior is very attractive in many situations. 

Some existing systems, such as WebWatcher [16], Letizia [17], WebPersonali- 
zer [18], concentrate on providing Web Site personalization based on usage infor- 
mation. WebWatcher [16] “observes” users as they browse the Web and identifies 
links that are potentially interesting to them. Letizia [17] is a client side agent 
that searches for pages similar to the ones already visited. The WebPersonal- 
izer [18] page recommendations are based on clusters of pages found by the server 
log for a site: the system recommends pages from clusters that most closely match 
the current session. Basically, these systems analyze the user navigations and 
propose some kind of related (useful) information. We instead take a different
approach, based on gathering the navigation information into a (semistructured, 
graph-based) database that can be queried, for example, with a SQL-like query 
language. 

Mining mainly consists of the analysis of usage patterns through pattern
recognition algorithms that run on user navigation information. This information
can be represented in different ways, e.g. by means of log files. Here we adopt
a graph-based representation for user logs, using the same temporal model we
adopt to represent Web sites.

The novel idea of our proposal is to use a generic graph-based data model
called TGM [19] to store uniformly represented Web sites and user interactions, 
and to apply graph transformation as an algorithm to collect in a unique graph 
the information about the navigation of a group of users. From this new data 
structure we can directly extract relevant information by using a SQL-like query
language commonly adopted to query static or dynamic information stored in
databases. Indeed, queries applied to our graph-based structures can be used to
find the most frequently visited pages and traversed links. This information can
be used to optimize the usability of a site by rearranging links.

The structure of the paper is as follows: in Section 2 we present the graphical 
model we use to represent information about sites and user interactions with 
them. In Section 3 we recall the basic notion of graph transformation, and in 
Section 4 we apply the graph transformation technique for deriving global infor- 
mation about user navigation activities. Some conclusions and possible lines for 
future work are sketched in Section 5. 

2 A Semistructured Temporal Model for Representing 
Web Sites and User Navigation Activities 

The TGM temporal data model [19] we will use in this work is based on labeled 
graphs and makes use of temporal elements to store the validity of objects and 
their relationships. Here we represent a Web site as a semistructured temporal 
graph $G = (N, E, \ell)$, where N is the set of nodes, E is the set of edges, and $\ell$
is the labeling function which assigns to each node or edge its set of labels. In
particular, each node has a unique identifier, a name, a type (complex or atomic) 
and a temporal element (i.e. the union of one or more time intervals). Complex 
nodes (depicted as rectangles) represent Web pages, atomic nodes (depicted as 
ovals) represent the elementary information contained in the pages. The identifier 
is depicted as a number in the upper-right corner (it is reported only for complex 
nodes for better readability). Edges have a name and a temporal element: edges 
between complex nodes represent navigational links and are labeled “Link” , 
whereas edges between complex nodes and atomic nodes represent containment 
relationships and are labeled “HasProperty” . For readability reasons we will 
omit edge labels in the examples. The temporal element of each node or edge 
states its validity time, that is, the validity period of that piece of information 
in the Web site. 

For example, in Figure 1 we show the representation of the Web site of 
a university. Note that although the Professor nodes are quite similar to each 
other in the structure, there are some differences: the Name and Office number 
can be simple strings or can be contained in an object with subobjects listing 
the Professor Name and the Office number. 

The analysis of how users are visiting a site is crucial for optimizing the 
structure of the Web site or for creating appropriate site views. Thus, before 
running the mining algorithms one should appropriately preprocess the data 
(see for example [9]): sequences of page references must be grouped into logical 
units representing user sessions. A user session is the set of page references made 
by a user during a single visit to a site and is the smallest “work unit” to consider 
for grouping interesting page references as proposed in [7, 10]. 

Fig. 1. A Semistructured Temporal Graph representing a Web Site

It is important to note that, given a Web site modeled by a semistructured
temporal graph G, at a specific time instant t a user U can interact with the
current view (i.e. the currently valid portion) of the site itself, which is
represented by the so-called snapshot of G at time t. For simplicity we assume that
a site does not change during user navigation sessions; thus a user interacts only
with the current snapshot in a given session.
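The model and its snapshot can be sketched in code as follows. This is only an illustrative sketch, not the TGM implementation: the class and field names (`Node`, `TemporalGraph`, `valid`) are hypothetical, time instants are assumed to be integers, and a temporal element is assumed to be a list of half-open intervals.

```python
from dataclasses import dataclass, field

# Assumption: a temporal element is a list of half-open intervals [start, end).
Interval = tuple[int, int]


@dataclass
class Node:
    ident: int
    name: str
    kind: str  # "complex" (a Web page) or "atomic" (elementary information)
    valid: list[Interval] = field(default_factory=list)


@dataclass
class TemporalGraph:
    nodes: dict[int, Node] = field(default_factory=dict)
    # (source id, label, target id) -> temporal element of the edge
    edges: dict[tuple[int, str, int], list[Interval]] = field(default_factory=dict)

    def add_node(self, node: Node) -> None:
        self.nodes[node.ident] = node

    def add_edge(self, src: int, dst: int, valid: list[Interval]) -> None:
        # The label is determined by the types of the connected nodes.
        if self.nodes[src].kind != "complex":
            raise ValueError("edges must start at a complex node")
        label = "Link" if self.nodes[dst].kind == "complex" else "HasProperty"
        self.edges[(src, label, dst)] = valid

    def snapshot(self, t: int) -> "TemporalGraph":
        """Return the portion of the graph that is valid at time instant t."""
        def holds(element: list[Interval]) -> bool:
            return any(start <= t < end for start, end in element)

        snap = TemporalGraph()
        for node in self.nodes.values():
            if holds(node.valid):
                snap.add_node(node)
        for (src, label, dst), element in self.edges.items():
            if holds(element) and src in snap.nodes and dst in snap.nodes:
                snap.edges[(src, label, dst)] = element
        return snap


site = TemporalGraph()
site.add_node(Node(1, "Home", "complex", [(0, 100)]))
site.add_node(Node(2, "Professor", "complex", [(0, 50)]))
site.add_node(Node(3, "Name", "atomic", [(0, 50)]))
site.add_edge(1, 2, [(0, 50)])
site.add_edge(2, 3, [(0, 50)])
```

Deriving the edge label inside `add_edge` mirrors the rule that "Link" connects two complex nodes and "HasProperty" connects a complex node to an atomic one.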

By means of an interaction, a user defines a path through the snapshot. 

A path in a semistructured temporal graph $G = (N, E, \ell)$ is a non-empty
sequence $p = (n_0, n_1, \ldots, n_m)$ of nodes s.t.: $\forall i : 0 \le i \le m$, $n_i \in N$ and
$\forall i : 0 \le i < m$, $(n_i, Link, n_{i+1}) \in E$.

The interaction of a user U in a session S with a Web site represented by
a semistructured temporal graph G is a pair $(p, TT)$ where:

1. $p = (n_1, n_2, \ldots, n_m)$ is a path in the snapshot of G at time t;

2. $TT : N_p \to V$ is a labeling function such that:
$N_p$ is the multiset of the nodes that compose the path p, defined as
$N_p = \{[n_1, n_2, \ldots, n_m]\}$, V is the set of all possible temporal elements,
and $\forall i : 0 < i \le m$, $TT(n_i) = t_i$, where $t_i \in V$ is a temporal element
representing the “thinking time” of user U on node $n_i$ (that is, the time spent
for visiting, or the valid time related to the user browsing activity).

Fig. 2. The Global Interaction of a user (Session A, Session B, Session C)
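A minimal sketch of the two definitions above, assuming pages are identified by integers, "Link" edges are given as a set of (source, target) pairs, and each thinking time is a single half-open interval. The function names `is_path` and `make_interaction` are hypothetical.

```python
def is_path(nodes, links, p):
    """True if p is a non-empty node sequence whose consecutive pairs are Link edges."""
    if not p or any(n not in nodes for n in p):
        return False
    return all((a, b) in links for a, b in zip(p, p[1:]))


def make_interaction(nodes, links, p, thinking_times):
    """Pair a path in the snapshot with one thinking-time interval per visit."""
    if not is_path(nodes, links, p):
        raise ValueError("not a path in the snapshot")
    if len(thinking_times) != len(p):
        raise ValueError("one temporal element is needed per visit")
    # N_p is a multiset, so visits are kept by position rather than by node id:
    # the same page may appear several times with different thinking times.
    return list(zip(p, thinking_times))


snapshot_nodes = {1, 2, 3}
snapshot_links = {(1, 2), (2, 1), (2, 3)}
session = make_interaction(
    snapshot_nodes, snapshot_links, [1, 2, 1], [(0, 5), (5, 9), (9, 12)]
)
```

Keeping the visits as a list of (page, interval) pairs preserves repeated visits to the same page, which is what the multiset $N_p$ expresses.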

The set of interactions of a user U with a Web site, represented by 
a semistructured temporal graph G, is called the global interaction of U with G. 
In Figure 2 we can see an example of global interaction. For readability reasons, 
in the graphs representing user interactions we will draw only complex nodes, 
that correspond to visited pages, and not atomic nodes, that are just attributes 
related to the complex parent object and can be inferred from the Web site 
graph. 

At this point we could examine directly the global interaction, but we may 
find some problems by analyzing this structure: 

— its size grows linearly with the users’ page visits (the number of nodes is 
equal to the total number of visited pages), hence a visiting algorithm may
require substantial computing resources;

— we need to examine it thoroughly to get some information like the visit time 
of a specific page, because we have to sum the visit times of all the instances 
of that page in the global interaction; 

— our graph is actually a set of lists, thus it doesn’t exploit the opportunity of 
the graphs to have multiple and complex connections between nodes; 

— it represents raw data, and it is not intuitively readable for humans. 

For these reasons we propose merging a set of sessions into a single graph,
called the merge graph.

The merge graph $G_A$ of a global interaction $A = \{I_1, I_2, \ldots, I_k\}$, composed
of single interactions $I_j = ((n_{j1}, n_{j2}, \ldots, n_{jm_j}), TT_j)$, for $j \in \{1, 2, \ldots, k\}$,
on a semistructured temporal graph $G = (N, E, \ell)$, is a graph
$G_A = (N_A, E_A, TotalTT_A)$, where:

1. $N_A = N_A^{complex} \cup N_A^{atomic}$ where:
$N_A^{complex} = \{n_{11}, n_{12}, \ldots, n_{1m_1}\} \cup \{n_{21}, n_{22}, \ldots, n_{2m_2}\} \cup \ldots \cup \{n_{k1}, n_{k2}, \ldots, n_{km_k}\}$
and $N_A^{atomic} = \{m \mid \ell_T(m) = atomic \wedge \exists n \in N_A^{complex} ((n, HasProperty, m) \in E)\}$;

2. $\forall a, b \in N_A$, $(a, Link, b) \in E_A$ if and only if there is a path in A containing
the edge $(a, Link, b)$;

3. $\forall a, b \in N_A$, $(a, HasProperty, b) \in E_A$ if and only if $(a, HasProperty, b) \in E$;

4. $TotalTT_A : N_A \to V$ is a labeling function that gives the temporal element
related to each node $n \in N_A$, i.e. $TotalTT_A(n) = \bigcup_{\{n_{jq} \in I_j \mid n_{jq} = n\}} TT_j(n_{jq})$
(the union of the thinking time intervals of node n in the global interaction A).

Fig. 3. A Merge graph

Figure 3 contains the merge graph of the global interaction of Figure 2. 
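The definition can be read directly as a construction. The sketch below is an illustration under simplifying assumptions, not the authors' implementation: it handles only complex nodes and "Link" edges (as in the figures), represents each session as a list of (page id, interval) pairs, and keeps the thinking times of a page as a plain list of intervals. The function name `merge_graph` is hypothetical.

```python
from collections import defaultdict


def merge_graph(global_interaction):
    """Build (nodes, links, total_tt) from a list of sessions.

    Each session is a list of (page id, (start, end)) visits in path order.
    """
    links = set()
    total_tt = defaultdict(list)
    for session in global_interaction:
        for page, interval in session:
            total_tt[page].append(interval)  # thinking times of the page
        for (a, _), (b, _) in zip(session, session[1:]):
            links.add((a, b))  # a Link edge traversed by some path in A
    return set(total_tt), links, dict(total_tt)


session_a = [(1, (0, 5)), (2, (5, 8)), (1, (8, 10))]
session_b = [(1, (20, 22)), (3, (22, 30))]
nodes, links, total_tt = merge_graph([session_a, session_b])
```

The node set contains each visited page once, however many times it was visited, while the per-page interval list grows with the number of visits.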

The merge graph represents the activities that a user takes in a set of ses- 
sions (i.e. visited pages and clicked links). From this graph we can extract some 
relevant information useful to define the behavior of the user, e.g. the time spent 
by the user on each page, or the last visited page.

The merge graph represents the user navigations in a more compact way 
than the global interaction, with some advantages: 

— even with a large user interaction, the size of the merge graph (in terms of the 
number of nodes) will be limited by a constant upper bound that corresponds 
to the size of the Web site. Note that the temporal elements associated with
nodes may grow very large; to address this, we can shrink temporal elements
periodically by deleting older information;

— we can get useful information just by a local analysis, for example to get the 
visit time of one node we just need to examine that node and no others; 

— here we have a “real graph”, i.e. a more complex linked structure that
may carry more information than a set of lists;

— it is more intuitively readable for humans: it can be seen as the visited subset
of the Web site.

The merge graph can be considered a sort of “personal view” of the Web site: 
in fact, it represents the portion of the site visited by the user; moreover, each 
node in the merge graph has as temporal element (i.e. interaction time) a subset 
of the temporal element of the corresponding node in the Web site graph (where 
the time notion represents the valid time). 

However, reducing our information to such a small data structure has some 
drawbacks. The process of merging interactions does not keep all information: 
the graph can contain cycles, so it is not possible, in general, to derive either the
exact sequence of visited pages or the information about the sessions. Still,
the information content of the merge graph is enough for many applications
to get the desired knowledge about user navigations; for the mining activities
that require the missing information, we can directly query the original
global interaction.

So far we have considered the interaction of a single particular user, but we 
can extend our approach to a more general scenario and consider the interaction
of groups of different users to analyze their behaviors. This extension requires 
some further considerations on simultaneous visits. When building the merge 
graph for a group of users, we have to take into account that different users could 
access the same page at the same time, therefore thinking time intervals may 
overlap. If we merged them, we would lose the information that the page
was visited several times rather than once: hence, we need to keep the time intervals
separated in order to have the real time spent on a page, and the union of time
intervals will be, as in the case of a single user, just their concatenation (i.e. the
union without merging overlapping intervals). We will mark the concatenation
of time intervals with the symbol $\uplus$.

Note that these considerations hold also in the case of a single user working 
in parallel sessions: in fact, if a user could make different simultaneous sessions, 
we should keep the information that in a specific instant t a node can be visited 
simultaneously through different sessions instead of being visited in just one
session; thus this case can be reduced to the one related to the group of users.

If we depict the union of time intervals that overlap partially or totally as is,
we get an untidy notation where some time instants belong to more than
one interval. To avoid this problem we can consider a more powerful structure in 
which any time instant can appear at most in one interval: we can rearrange the 
temporal elements in a set of non-overlapping simple intervals (that may span 
contiguous time periods), each with an associated integer number called weight 
representing the number of times all the instants of that interval appear in the 
temporal element. 

A weighted temporal element $WTE = [s_1, e_1)^{w_1} \uplus [s_2, e_2)^{w_2} \uplus \ldots \uplus [s_n, e_n)^{w_n}$
is the union of weighted intervals $[s_i, e_i)^{w_i}$, $i \in \{1, 2, \ldots, n\}$, where $s_i$
is the start time, $e_i$ the end time, and $w_i$ the weight of the interval.
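As an illustration, a weighted temporal element can be computed from a concatenation of possibly overlapping intervals with a simple sweep over the interval endpoints. This is a sketch of one possible procedure, not the graph transformation rules discussed later in the paper; the function name is hypothetical, time instants are assumed to be comparable numbers, and contiguous intervals with equal weight are deliberately left separate.

```python
def weighted_temporal_element(intervals):
    """Turn possibly overlapping [start, end) intervals into disjoint weighted ones.

    Returns a list of (start, end, weight) triples sorted by start time.
    """
    events = []
    for start, end in intervals:
        events.append((start, +1))  # one more visit begins
        events.append((end, -1))    # one visit ends
    events.sort()

    result, weight, prev = [], 0, None
    for time, delta in events:
        # Emit the stretch since the previous endpoint if someone was visiting.
        if weight > 0 and prev is not None and time > prev:
            result.append((prev, time, weight))
        weight += delta
        prev = time
    return result


wte = weighted_temporal_element([(0, 10), (5, 15), (5, 8)])
```

In the result every time instant belongs to at most one interval, and the weight records how many of the original intervals covered it.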

The utility of the merge graph structure (and in general of graph-based
structures reporting information about user navigation) becomes apparent when we
consider the queries we can apply to it in order to obtain information useful for
Web analysis purposes. As an example, we propose some intuitive SQL-like queries
without entering into the details of the query language, defined in [19, 13].

— Find the most visited Degree Course page:

    SELECT DegreeCourse.HasProperty.Name
    FROM Merge Graph
    WHERE DegreeCourse->COUNT >= ALL (
        SELECT DegreeCourse->COUNT
        FROM Merge Graph )

— Find the courses of Prof. Ghezzi which have been visited from his page:

    SELECT Course.HasProperty.Name
    FROM Merge Graph
    WHERE EXISTS Professor.Link.Course
    AND Professor.HasProperty.Name = “Ghezzi”

— Find the office number of the professors whose pages have been visited in the
week of Christmas 2002:

    SELECT Professor.HasProperty.Office
    FROM Merge Graph
    WHEN Professor OVERLAP [2002/12/23-00:00, 2002/12/30-00:00)

The possibility of directly deriving this kind of information from a graph-based
structure, by means of a simple query language, motivates our interest in applying
a powerful and elegant technique, such as graph transformation, in a database
context.
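To give a rough idea of what such queries compute, the sketch below evaluates two of them in plain Python over a toy merge graph. It does not implement the query language of [19, 13]: the page names and data are hypothetical, `COUNT` is assumed to mean the number of visit intervals of a page, and `OVERLAP` is assumed to test intersection of half-open intervals.

```python
from datetime import datetime

# Hypothetical merge-graph fragment: page name -> list of [start, end) visits.
visits = {
    "Professor A": [(datetime(2002, 12, 24, 10), datetime(2002, 12, 24, 11))],
    "Professor B": [(datetime(2003, 1, 7, 9), datetime(2003, 1, 7, 10))],
}


def overlap(temporal_element, start, end):
    """True if some [s, e) interval of the element intersects [start, end)."""
    return any(s < end and start < e for s, e in temporal_element)


def visited_during(pages, start, end):
    """Pages whose temporal element overlaps the given period (WHEN ... OVERLAP)."""
    return sorted(p for p, te in pages.items() if overlap(te, start, end))


def most_visited(pages):
    """Pages with the highest visit count (assumed reading of COUNT >= ALL)."""
    best = max(len(te) for te in pages.values())
    return sorted(p for p, te in pages.items() if len(te) == best)


christmas_week = visited_during(
    visits, datetime(2002, 12, 23), datetime(2002, 12, 30)
)
```

Both functions only inspect the temporal element stored on each node, which reflects the local analysis that the merge graph makes possible.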

3 Basic Notions on Graph Transformation 

Graphs are well-acknowledged means to formally represent many kinds of mod- 
els, e.g. complex nodes, databases, system states, diagrams, architectures. Rules 
are very useful to describe computations by local transformations, e.g. arith- 
metic, syntactic and deduction rules. Graph transformations combine the advan- 
tages of both graphs and rules [4, 14]; here we will apply them for formalizing 
the algorithm to manipulate user interaction graphs. 

Actually, our use will exploit only a small part of the potential of graph
transformations, and we refer to [2] for some more advanced applications.

To define the concept of graph transformation, we consider the algebraic, 
more general, notion of graph (also called multigraph). The TGM model con- 
siders edges as in the relational notion of graph (i.e. the edge set is a binary 
relation on the set of nodes), with the (label) extension which allows one to 
insert multiple edges between the same two nodes, if these edges have different 
labels (i.e. the edge set is a ternary relation between the set of nodes and the set 
of edge labels). Note that in our application of semistructured temporal graph 
the edge label is determined by the type of the nodes it connects: it is a “Link” for
complex→complex relationships and “HasProperty” for complex→atomic
relationships. It follows that we do not have parallel edges, but we use the multigraph
data structure anyway for convenience of notation: we can refer to the functions
$source : E \to N$ and $target : E \to N$, which give the source and target node of
a given edge, it being understood that there cannot be parallel edges.

A graph transformation basically consists of deriving a graph from another 
one by applying a rule that transforms it by replacing a part with another 
subgraph. 

Very briefly, a graph transformation rule r = (L, R, K, glue, emb, appl)
consists of: two graphs L and R, respectively called the left-hand side and the
right-hand side of r; a subgraph K of L called the interface graph; an
occurrence glue of K in R, relating the interface graph with the right-hand side; an
embedding relation emb, relating nodes of L to nodes of R; a set appl specifying
the application conditions for the rule.

An application of a rule r = (L, R, K, glue, emb, appl) to a given graph G
produces a resulting graph H, if H can be obtained from G in the following steps:
choose an occurrence of L in G; check the application conditions according to
appl; remove the occurrence of L with the exclusion of the subgraph K from G,
including all dangling edges, yielding the context graph D of L (that still contains
an occurrence of K); glue D and R according to the occurrences of K in them
(that is, construct the disjoint union of D and R and, for every item in K,
identify the corresponding item in D with the corresponding item in R), yielding
the gluing graph E; embed R into E according to emb: for each removed dangling
edge that was connecting a node $v \in G$ and the image of a node $v_1 \in L$ in G,
and each $v_2 \in R$ such that $(v_1, v_2) \in emb$, add a new edge (with the same label)
between v and the glued image of $v_2$ in G.

4 Applying Graph Transformations 

In the previous section we have seen the basics of graph transformation, and now 
we apply this technique to build the merge graph from the global interaction, 
and also to combine temporal elements. 



4.1 Merging User Navigation Histories 

For the purpose of transforming the global interaction to get the merge graph, we 
just need one transformation rule, called Rmerge, shown in Figure 4. Intuitively, 
each application of the rule merges two nodes that have the same id and label,
unifying their temporal elements and collapsing all their incoming and outgoing
edges onto the same node.
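The effect of repeatedly applying such a rule can be sketched as follows. This is an illustration of the intuitive description above, not the formal rule of Figure 4: each visit is assumed to be a separate node instance with a hypothetical key, "Link" edges connect instance keys, and the occurrence to merge is picked at random to mimic the free choice of where the rule is applied.

```python
import random


def apply_rmerge(instances, edges, rng=None):
    """instances: {key: (page id, [intervals])}; edges: set of (key, key) links."""
    rng = rng or random.Random(0)
    instances = {k: (page, list(te)) for k, (page, te) in instances.items()}
    edges = set(edges)
    steps = 0
    while True:
        by_page = {}
        for key, (page, _) in instances.items():
            by_page.setdefault(page, []).append(key)
        candidates = [keys for keys in by_page.values() if len(keys) > 1]
        if not candidates:
            return instances, edges, steps  # no two nodes share a page id
        keep, drop = rng.sample(rng.choice(candidates), 2)
        page, te = instances[keep]
        instances[keep] = (page, te + instances.pop(drop)[1])  # unify elements
        # Collapse incoming and outgoing edges of the dropped node onto `keep`.
        edges = {(keep if a == drop else a, keep if b == drop else b)
                 for a, b in edges}
        steps += 1


def as_merge_graph(instances, edges):
    """Express the result in terms of page ids so different runs can be compared."""
    page_of = {k: page for k, (page, _) in instances.items()}
    nodes = {page: sorted(te) for page, te in instances.values()}
    return nodes, {(page_of[a], page_of[b]) for a, b in edges}


visits = {"a1": (1, [(0, 5)]), "a2": (2, [(5, 8)]), "a3": (1, [(8, 10)]),
          "b1": (1, [(20, 22)]), "b2": (3, [(22, 30)])}
links = {("a1", "a2"), ("a2", "a3"), ("b1", "b2")}
merged, merged_links, steps = apply_rmerge(visits, links)
```

Running the loop with different random seeds on this example gives the same graph once the result is normalized with `as_merge_graph`, since only the order in which intervals are concatenated varies.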

In Figure 5 we show the application of the rule in a general case. Figure 6 
briefly shows the steps to build the merge graph from the global interaction 
depicted in Figure 2. For simplicity, a node just contains its identifier originally 
reported in the Web site graph. The nodes involved in every step are marked 
with dashed circles and arrows.

Fig. 4. The transformation rule Rmerge for creating the merge graph

Fig. 5. Rule application in a general case



In graph transformation, non-determinism may occur, and this can happen 
at two levels: 

1. if we have a set of transformation rules, we have to choose one among the 
applicable rules: in our particular context, this issue will not cause concern 
because we have only one transformation rule; 

2. given the chosen rule, there could be several occurrences of its left-hand side 
in the graph. This deserves some consideration: given a graph,
we may have to decide which occurrence the rule should be applied to
first. However, this turns out not to be a problem: in fact, it can
be proved that the iterated application of our rule will always converge to
the same solution, unambiguously given by the definition of merge graph,
independently of the decisions taken.

Proposition 1. Given a global interaction A over a semistructured temporal
graph G, applying Rmerge to A yields a unique merge graph $G_A$.

Fig. 6. Examples of rule application

Fig. 7. Graph representation of Temporal Elements



Basically, the proof relies on the fact that the merge graph contains all the nodes
of the global interaction without any repetition, and on the commutativity
of the union operator (between time intervals in our context).

We can easily compute how many times we have to apply the rule: given
a Web site whose graph representation is composed of nodes from the set
$N = \{n_1, n_2, \ldots, n_m\}$, the global interaction defines a function $num\_occ : N \to \mathbb{N}$
that returns the number of occurrences of each node in the interaction itself.
Hence the number of rule applications is $\sum_{i=1}^{m} MAX(num\_occ(n_i) - 1, 0)$.
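The count can be checked on a small example. The sketch below assumes sessions are given as lists of page ids; the function name is hypothetical, and pages that are never visited simply contribute zero to the sum.

```python
from collections import Counter


def rule_applications(global_interaction):
    """Number of merge steps needed: sum over pages of max(num_occ - 1, 0)."""
    num_occ = Counter(page for session in global_interaction for page in session)
    return sum(max(count - 1, 0) for count in num_occ.values())


# Page 1 occurs three times, pages 2 and 3 once each: two merges are needed.
needed = rule_applications([[1, 2, 1], [1, 3]])
```

Each application removes exactly one duplicate occurrence of a page, which is why a page visited c times requires c - 1 applications.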

4.2 Union of Temporal Elements 

We can apply graph transformation for the purpose of merging temporal
elements too. As we previously said, if we unify these time-based structures we
lose the information that a node is visited more than once: therefore, we consider
the following rules only as a base step to define the ones for weighted temporal
elements, as we will see in the next subsection.

To be able to apply graph transformation, we represent a temporal element
with an unconnected graph where each node corresponds to a single time interval
and has two attributes, start and end. In Figure 7 we represent a generic
temporal element $[s_1, e_1) \uplus [s_2, e_2) \uplus \cdots \uplus [s_n, e_n)$.

In Figure 8 we show the two graph transformation rules that merge temporal 
elements. The first rule is used to merge two simple intervals such that one is 
contained in the other (total overlapping); the second rule is applied when there 
is just a partial overlapping between two intervals. In both cases we replace 
the nodes with one new node having an interval that covers the original union 
interval. For this graph transformation the interface K of the left-hand side L 
is empty and its occurrence glue in the right-hand side R is empty as well. The
embedding function is also empty because the graph has no edges: its only
purpose would be to restore the dangling edges of removed nodes.

Fig. 8. Transformation rules for Temporal Elements

Fig. 9. Graph representation of Weighted Temporal Elements (one unconnected
node per interval, with attributes start: $s_i$, end: $e_i$ and weight: $w_i$,
for i = 1, ..., n)
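
The effect of iterating the two rules can be sketched in Python as follows. This is only an illustration under stated assumptions: intervals are encoded as half-open `(start, end)` pairs, both rules are folded into one overlap test, and merely adjacent intervals such as [1, 3) and [3, 5) are left unmerged; none of these choices is prescribed by the rules in Figure 8.

```python
def merge_step(intervals):
    """Apply one merge rule to the first overlapping pair, or return None."""
    for i in range(len(intervals)):
        for j in range(i + 1, len(intervals)):
            (s1, e1), (s2, e2) = intervals[i], intervals[j]
            if s1 < e2 and s2 < e1:  # total or partial overlap
                merged = (min(s1, s2), max(e1, e2))  # covers the union
                rest = [iv for k, iv in enumerate(intervals) if k not in (i, j)]
                return rest + [merged]
    return None


def union(intervals):
    """Iterate the merge rules until no pair of nodes overlaps."""
    current = list(intervals)
    while (nxt := merge_step(current)) is not None:
        current = nxt
    return sorted(current)


element = [(1, 5), (2, 3), (4, 8), (10, 12)]
merged_element = union(element)
```

Each step replaces two nodes by one, so the loop terminates after at most n - 1 steps for n intervals.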

4.3 Union of Weighted Temporal Elements 

We can extend the application of graph transformation to unify weighted
temporal elements. To represent them with a graph, we need to add the weight
attribute to nodes. In Figure 9 we represent a generic weighted temporal
element $WTE = [s_1, e_1)^{w_1} \uplus [s_2, e_2)^{w_2} \uplus \ldots \uplus [s_n, e_n)^{w_n}$.

The transformation rules (see Figure 10) have some similarities with the ones 
of Figure 8, in the sense that we still have two rules that merge total and partial 
overlapping intervals, respectively. The difference is that here, in the
left-hand side, we have two nodes with weights $w_1$ and $w_2$ respectively.
Thus, the rule application transforms them into nodes, corresponding to time
intervals, with three possible different weights: $w_1$, $w_2$ and $w_1 + w_2$
for intervals containing time instants in only the first, only the second, and
both the first and the second time interval of the original graph,
respectively. In general, a rule application produces three nodes, but some of
them may have the same start and end time: in this case the redundant node can
be removed, and the third rule serves this purpose.
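
A minimal Python sketch of one such rule application is given below. The triple encoding `(start, end, weight)` and the function name are assumptions of this example; the removal of redundant nodes performed by the third rule is obtained here by collapsing equal boundary points before the pieces are built.

```python
def merge_weighted(a, b):
    """Split two overlapping weighted intervals into disjoint weighted pieces."""
    (s1, e1, w1), (s2, e2, w2) = a, b
    if not (s1 < e2 and s2 < e1):
        raise ValueError("intervals do not overlap")
    # Equal boundaries collapse in the set, so no empty piece is produced.
    points = sorted({s1, e1, s2, e2})
    pieces = []
    for start, end in zip(points, points[1:]):
        weight = 0
        if s1 <= start and end <= e1:  # piece lies inside the first interval
            weight += w1
        if s2 <= start and end <= e2:  # piece lies inside the second interval
            weight += w2
        pieces.append((start, end, weight))
    return pieces


partial = merge_weighted((0, 10, 2), (5, 15, 3))
total = merge_weighted((0, 10, 1), (0, 4, 1))
```

In the partial-overlap case three nodes result; in the second call the two intervals share their start, so only two nodes remain.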

5 Conclusion and Future Work 

In this work we used a graphical temporal model for representing time-varying
Web sites and user interaction activities while navigating the Web. In
particular, we discussed in some detail how graph transformation can be applied
to obtain a graph-based structure summarizing the navigation activities of
a user or a group of users.

Fig. 10. Transformation rules for Weighted Temporal Elements

As future work, we plan to implement a system, based on this work, that
customizes the Web experience by using a graph transformation tool.
Abstract. One of the challenges faced by Web developers is how to create
a coherent application out of a series of independent Web pages. This
problem is a particular concern in Web development because HTTP, the
underlying protocol, is stateless. Each browser request to a Web server is
independent, and the server retains no memory of a browser's past requests.
To overcome this limitation, application developers require a technique to
provide consistent user sessions on the Web. Before implementing a Web
application, developers have to decide which session data to store. In this
paper, we provide a modelling approach for powerful and flexible Web session
management, based on UML. We propose the definition of a session model which
covers version management issues. The validation of a session model
concerning consistency issues is possible due to the formal basis of our
approach, which uses graph transformation.



1 Introduction 

State management is the process by which you maintain state and page infor- 
mation over multiple requests for the same or different pages. As is true for any 
HTTP-based technology, Web form pages are stateless, which means that they 
do not automatically indicate whether the requests in a sequence are all from 
the same client or even whether a single browser instance is still actively viewing 
a page or site. Furthermore, pages are destroyed and recreated with each round 
trip to the server; therefore page information will not exist beyond the life cycle 
of a single page. There are various client-side and server-side options for state 
management. 

Storing page information using client-side options doesn’t use server re- 
sources. However, because you must send information to the client for it to 
be stored, there is a practical limit on how much information you can store this 
way. Client-side options are URL extensions, hidden fields, and cookies. State
information can be added to a URL (Uniform Resource Locator) as additional
path information. With hidden fields, state information is stored inside the
fields of an HTML document. Hidden fields are useful in Web forms, since the
value associated with the field is sent to the server when the form is
submitted. Another very popular approach to session management is to use
cookies.

To design session handling in Web applications adequately, several questions 
have to be answered. First of all, what is the important session-specific data 
to be held? Then, which changes have to be recorded? Further questions occur 
concerning the security level of different kinds of information. 

In general, a Web application is well designed if it follows the
model-view-controller approach (compare e.g. Struts in the Apache Jakarta
Project [1]). The Web pages (client and server pages) build the view on the
business model which contains persistent business data and business logic. It
is e.g. encapsulated in an Enterprise Java Bean on the server side.
A dispatcher in the form of a servlet controls the input and forwards it to
the business model. For a first design of the
session management, we propose a session model which contains structural as 
well as behavioural aspects of session handling. It contains a representation of 
the session-specific data together with version management aspects to keep track 
of all important state changes in sessions. 

Several approaches to UML-based ([14]) Web engineering exist which are 
sketched and compared in [8]. To describe the structural aspects of Web pages 
we use the Web Application Extension (WAE) [3] to UML. Besides the structural 
aspects, also the workflow of the guided input processing has to be modelled. 
This is done by state diagrams containing different kinds of actions, e.g. the user 
can follow a link, submit a form, etc. A comprehensive approach is the UML 
Web Engineering approach (UWE) [9]. Here, UML is used to design a Web 
application on different layers. Especially the runtime layer deals with sessions 
and provides a history functionality for all activities performed by the user. 
The user can browse through instances, modify them or be inactive. Our
approach can be seen as closely related to UWE, with sessions modelled
in such a way that the revision structure of session objects is explicitly
handled. It is possible to navigate within a hierarchical revision structure
and to compose complete previous session states from there.

Having an executable session model at hand, the consistency of session
management can already be tested at design time. But testing a session model
faces some fundamental problems: sessions are needed for complex transactions
in Web applications, which can involve a number of dependent Web pages. The
user can jump arbitrarily forward and backward between such dependent Web
pages by means of navigation facilities. This behaviour leads to an explosion
of test cases. Furthermore, it is hard to try all possible entries in a given
form submission, which might lead to different follow-up pages.

Due to this testing dilemma, we propose a session model validation con- 
cerning certain consistency constraints. Such a semantic consistency checking is 
possible if we translate the session model (given as UML model) into the se- 
mantic domain of graph transformation [12] where constraint checking facilities 
are available. Considering session states as graphs and state changes as graph 
transformations leads to a formal session model. Consistency constraints can be
formulated in OCL [15]. In this paper, we concentrate on the formulation of (a
restricted form of) invariants. They can be used to express that certain safety
conditions hold during a session. For constraint checking of the session model,
the OCL constraints are translated into a set of graph formulae. Then, the
formal model has to be validated by checking the consistency of the initial
state graph and showing that all rules preserve the consistency.

Fig. 1. On-line shopping scenario

1.1 Running Example 

Our running example is taken out of the area of e-commerce. An on-line customer 
can browse through the products presented in a catalog. When the customer finds 
an item to buy, it can be added to a shopping cart. Before purchasing the items 
they are listed. In that situation the customer can still change the list,
that is, delete items or add new ones. No matter how the item list changes,
the shopping cart must always show the current number of items as well as
the correct overall sum.

Let's consider the concrete scenario in Figure 1: The customer selects one
item and then another one. The shopping cart shows that the number of items
is equal to 2. Listing the items, the customer deletes the first one. When the
customer chooses the second one in order to change it, the cart is shown with
a number of items equal to 1. After jumping back three pages, item 2 is shown
together with the cart containing two items. If the customer now buys another
copy of item 2, it is not clear how many items are in the cart altogether.
Since the customer jumped back to a past state and continued from there, the
number of items is now 3. But there is also e-commerce software which would
take into account that the customer deleted one item in between and would set
the number of items to 2. In this case, buying an item would not increase the
number of items, a behaviour which is hard to follow. But even when a number
of items equal to 3 is displayed, it may happen that the corresponding item
list contains only two items. Thus we run into a consistency problem: the
number of items in the shopping cart does not always correspond to the number
of items selected.

In the following, we develop a session model for this example where we have 
the possibility to jump back to complete session states such that inconsistencies 
are avoided. 
2 Session Modelling

In a general setting, the session model is part of the business model. It keeps
the session-specific information as long as the session persists. Each time the
information for a business action is completely available and in a consistent
form, it is transferred to the business model. In the UML-based Web
Engineering approach [9], the runtime layer deals with sessions and provides
a history functionality for all activities performed by the user within
a session. The user can browse through instances, modify them or be inactive.
Components such as pages are stored and have instantiations. Mapping
instantiations to session objects and components to session classes, our
session meta model could extend the UWE approach by versioning facilities.
Completing the description by collaborations concerning typical session
operations such as opening a session, creating a session object, editing or
deleting it, etc. then leads to an executable model.

2.1 Session Modelling with Versioning Facilities

The session model keeps all user input data and user requested information which
is essential for the running session. We use UML class diagrams to describe the
structural aspects of the session model. For session management, we consider
a slight UML extension for the revision management in session models (see
Figure 2), as follows:

— We consider a stereotype of objects, called session objects, which have a new 
tagged value for objects: isCurrent. It indicates the current state of a session 
object. 

— Session objects of one type are collected in a session class, a stereotype of 
classes. 

— There is a new dependency between session objects: <<revision>> which 
is a stereotype of dependency relations. This dependency is ordered to keep 
track of the order of revisions on one and the same session object. 

In [9], a number of consistency constraints are given for sessions formulated 
in OCL. Here, we add two further ones focussing on the revision structure of 
session objects: 

1. Each session object is the revision of at most one other session object. 

(1) context SessionObject inv: self.history.origin->size() <= 1

2. Exactly one session object in one revision tree is current. 

To formulate this constraint by OCL, we need the following additional op- 
erations: 

allRevisions : Set(SessionObject);
allRevisions = self.revision.derived->
    union(self.revision.derived.allRevisions);



Towards Validation of Session Management in Web Applications 







Fig. 2. Metamodel extension for session management 



allOrigins : Set(SessionObject);
allOrigins = self.history.origin->
    union(self.history.origin.allOrigins);

(2) context SessionObject inv: ((self.isCurrent = true) xor
    self.allRevisions->select(isCurrent = true)->size = 1) xor
    self.allOrigins->select(isCurrent = true)->size = 1
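
To make the two revision constraints more concrete, the following Python sketch models a revision tree of session objects and checks both constraints on it. The class layout and all names (`SessionObject`, `revise`, `exactly_one_current`, `at_most_one_origin`) are assumptions of this illustration, and the second check implements the prose statement "exactly one session object in one revision tree is current" directly rather than evaluating the OCL expression.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class SessionObject:
    name: str
    is_current: bool = False
    origin: Optional["SessionObject"] = None  # revision edge to the origin
    derived: List["SessionObject"] = field(default_factory=list)

    def root(self):
        node = self
        while node.origin is not None:
            node = node.origin
        return node

    def tree(self):
        """All objects of the revision tree below (and including) self."""
        result = [self]
        for child in self.derived:
            result.extend(child.tree())
        return result

    def revise(self, name):
        """Create a new revision of self and make it the current one."""
        for obj in self.root().tree():
            obj.is_current = False
        revision = SessionObject(name, is_current=True, origin=self)
        self.derived.append(revision)
        return revision


def at_most_one_origin(objects):
    """Constraint 1: each object is the revision of at most one other object."""
    return all(sum(o in p.derived for p in objects) <= 1 for o in objects)


def exactly_one_current(obj):
    """Constraint 2: exactly one object in the revision tree is current."""
    return sum(o.is_current for o in obj.root().tree()) == 1


v1 = SessionObject("cart v1", is_current=True)
v2 = v1.revise("cart v2")
v1.is_current, v2.is_current = True, False  # jump back: no new revision
v3 = v1.revise("cart v3")                   # continuing opens a new branch
```

Here `v2` and `v3` are two branches below `v1`, and only `v3` is current.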

The behavioural aspects of sessions are described by special collaborations 
which show the typical flow of activities. The activities are method calls. Here,
we do not concentrate on the contents of the methods, but only state
important constraints which have to be valid after a method call. Actions which
are important for the session such as user input actions or input dependent 
business computations (intermediate input validation or configuration settings) 
would cause revisions of corresponding session objects. Session actions such as 
backward and forward jumping do not cause new revisions but reset the current 
session state. 



Example The session model has to keep track of all items in the shopping cart. 
Each time the user puts a new item into the cart, changes it or deletes it from 
the item list, this action has to be reflected in the session model. Thus, we have 
to keep track of the item list, all the items in the list and the cart. This results 
in three session classes in the class diagram of the session model in Figure 3. The 
session model is connected with the Web page model by server page EditList 
which delegates user actions changing an item or the item list, to the item list. 

One specific session state is depicted in the object diagram in Figure 4. It
shows the state of a session where the user has put one item into the cart and
jumped back to the initial page showing e.g. a special offer. Performing
another put action then leads to a new revision branch. The framed part shows
the session state before the last put action took place.

Fig. 3. Session model class diagram

Fig. 4. A session state

Putting an item into the cart, changing it or deleting it from the item list
are user actions which cause new revisions in the session model. We model two
of them by the collaborations in Figures 5 and 6. These are collaborations on
the instance level and show the method flow within the session model. Main
constraints are described using OCL.

Fig. 5. Collaboration for changing an item

Fig. 6. Collaboration for putting an item into the list

3 Session Validation 

After building up a session model we present how to use it to validate consis- 
tency properties. The validation of semantical consistency conditions is possible 
if we translate the session model which is given as UML model, into a semantic 
domain. We choose graph transformation as semantic domain, since there is the 
possibility to not only test consistency constraints but also validate the complete 
behaviour of the session model. Considering session states as graphs and state 
changes as graph transformations leads to a formal session model. State changes 
are caused by actions which are formalized to graph rules. 

Besides syntactic consistency constraints given in the previous section, also 
semantic constraints can be formulated by OCL. Syntactic constraints are formu- 
lated for abstract syntax graphs being instances of the UML meta-model, while 
semantic constraints are formulated on one semantic model which is a graph 
transformation system in our case. Here, a state is modelled by a graph and all 
graphs have to satisfy the semantic constraints. 

In general, checking OCL constraints can mean two things: (1) to test whether
concrete instances (abstract syntax graphs or states) are consistent, or (2) to
validate that the set of all instances (abstract syntax graphs or states) is
consistent.






Anilda Qemali and Gabriele Taentzer 



The USE tool [11] and the Dresden OCL Toolkit [6] provide concepts and 
tool support to parse and compile OCL constraints. They are useful to check 
syntactic consistency of a UML model as well as to test semantic consistency. 
They do not support the proof of semantic consistency constraints, i.e.
validating that all possible states of a semantic model are consistent.



3.1 Semantic Consistency Constraints 

Similar to syntactic constraints, semantic consistency constraints can also be
formulated in OCL. In this section, we concentrate on the formulation of
invariants within the session model and show sample invariants important for
our running example. They describe the two main safety constraints in sessions.



Example One of the most important invariants, if not the most important one,
is the following: The number of items in the shopping cart has to correspond
with the number of items in the selection list. This invariant can be
expressed in OCL as follows:

context Cart inv consistentNumberOfItems:
self.number = self.itemList.item->size()

The second important constraint states: The sum of items' costs in the
shopping cart is the sum of items' prices in the list. The OCL formulation of
this constraint is:

context Cart inv consistentSum:
self.sum = self.itemList.item.price->sum()
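
For readers less familiar with OCL, the two invariants can be read as simple predicates over a cart. The Python classes below are a hypothetical stand-in for the session classes of the example (the attribute `total` plays the role of `sum`, and `items` the role of `itemList.item`); they are not generated from the UML model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    name: str
    price: int


@dataclass
class Cart:
    number: int = 0
    total: int = 0  # corresponds to attribute sum in the OCL invariant
    items: List[Item] = field(default_factory=list)


def consistent_number_of_items(cart):
    return cart.number == len(cart.items)


def consistent_sum(cart):
    return cart.total == sum(item.price for item in cart.items)


# The problematic state of the running example: the cart displays three
# items although the item list contains only two.
cart = Cart(number=3, total=40,
            items=[Item("item 2", 20), Item("item 2", 20)])
```

On this state the first predicate fails while the second one holds, which is the kind of inconsistency the invariants are meant to exclude.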



3.2 A Semantic Model 

Graphs Graphs are often used as abstract representation of diagrams, e.g. of 
UML diagrams. In the following, we consider typed attributed graphs. The ma- 
nipulation of graphs is performed by the so-called double-pushout approach to 
graph transformation [4]. It was extended to the attributed case in [13]. 

In object-oriented modelling, the structural aspects can be described on two 
levels: the type level (modelled by class diagrams) and the instance level (mod- 
elled by object diagrams). Semantically, this coherence is mapped to typed 
graphs where a fixed type graph T serves as abstract representation of the 
class diagram. Its instances are mapped to graphs equipped with a structure- 
preserving mapping to the type graph, formally expressed by a graph homomor- 
phism. 



Example Translating the UML model of our running example to a graph
transformation model, we first have to construct the type graph which is
depicted in Figure 7. It contains a graph representation of the class diagram
in the session model extended by revision edges and attributes isCurrent for
all vertices which originate from the session model. These edges and
attributes are the semantical representation of session classes as containers
for session objects. Session states such as the ones depicted in Figure 4 by
object diagrams can be mapped straightforwardly to instance graphs. If tagged
value isCurrent is shown in a session object, the corresponding attribute has
value true, otherwise it has value false. It is obvious that such instance
graphs are typed over the type graph in Figure 7.

Fig. 7. The type graph



Graph Rules After having defined session states as typed attributed graphs, 
actions are formalized by graph rules describing how session states may change. 
Identities of vertices are represented by dashed arrows. Identities of edges are de- 
duced from the vertices they are connecting and their types. We use the double- 
pushout approach [4] which is type graph compatible. Furthermore, multi-objects 
are needed. Rules with multi-objects represent rule schemes which expand to 
a countably infinite set of graph transformation rules, one for each legal multi- 
plicity of the multi-object. When applying such a rule to a given graph, always 
the maximal rule is chosen among all applicable ones. 



Example Actions changeIt and put described in collaborations in Figures 5
and 6 are translated to rules in Figures 8 and 9. The rule name and parameter
list correspond to those of the corresponding method. Rule put produces new
revisions of the item list and the cart. The resulting item list contains all
previous items with the new one in addition. The previous items are depicted
by a multi-object which is mapped to all item vertices connected to the
current list vertex when the rule is applied. Rule changeIt changes attributes
of an item which leads to a new revision of that item. The rule also has to
create a new revision of the item list, although only the item is changed. But
assuming that several items are changed one after the other, this order has to
be recorded, which is done by revising the item list.

Fig. 8. Rule changeIt(in i:Item, s:Int, c:Enum)

Graph Transformation A graph transformation from a pre-state graph G to 
a post-state graph H describes changes on a concrete session state. The struc- 
tural and behavioural aspects of a session model can be formally represented 
by a typed graph transformation system GTS = (T, I, R) consisting of a type
graph T, an initial graph I which is an instance graph typed over T, and a pos- 
sibly infinite set R of graph rules with all left and right-hand sides typed over T. 
Infinite sets of rules are necessary, because multi-objects can occur and rules 
with multi-objects represent rule schemes. 

Example Considering the session state in Figure 4 represented as graph, an
application of rule put to the framed part would lead to the whole graph
representing the follow-up session state.

The whole formal session model for our example consists of the type graph in
Figure 7, the initial graph in Figure 10 showing the initial state of a session,
and a set of rules. This set comprises rules changeIt and put as depicted in
Figures 8 and 9 as well as a rule delete not explicitly shown. These are the
action rules in our formal session model. Moreover, it contains rules for
forward and backward jumping which do not insert new revisions, but just change
the isCurrent attribute along the revision structure. Backward jumping is
specified by rules jumpBack1 and jumpBack2 in Figures 11 and 12. Rule jumpBack1
is used when the previous action was a put or delete. In this case, the cart as
well as the item list have been revised, but not the listed items. Rule
jumpBack2 is used if the previous action was a change of an item. In this case,
this item and the list, but not the cart, have been revised. Rules for forward
jumping look similar.
Fig. 9. Rule put(in n:String, s:Int, c:Enum, p:Int)

Fig. 10. The initial state

Fig. 11. Rule jumpBack1
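
The interplay of an action rule and a jump rule can be imitated in a few lines of Python. This is a strongly simplified sketch, not the graph transformation system itself: a whole session state is treated as one immutable value with a single origin reference, whereas the formal model revises the item list, the cart and the items individually. The names `State`, `Session`, `put` and `jump_back` are chosen for this illustration, and forward jumping, which would need the derived revisions as well, is omitted.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class State:
    items: Tuple[Tuple[str, int], ...]  # (name, price) pairs of the item list
    number: int                         # cart attribute
    total: int                          # cart attribute sum
    origin: Optional["State"] = None    # revision edge to the previous state


class Session:
    def __init__(self):
        self.current = State(items=(), number=0, total=0)  # initial state

    def put(self, name, price):
        """Action rule: creates a new revision of item list and cart."""
        cur = self.current
        self.current = State(cur.items + ((name, price),), cur.number + 1,
                             cur.total + price, origin=cur)

    def jump_back(self):
        """Jump rule: no new revision, only the current marker moves."""
        if self.current.origin is not None:
            self.current = self.current.origin


session = Session()
session.put("item 1", 10)
session.jump_back()         # back to the initial page
session.put("item 2", 20)   # opens a new revision branch
```

After the second put, the current state contains only item 2 and a cart with one item, mirroring the situation described for Figure 4.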



3.3 Consistency Checking 

In section 3.1, we showed how semantic session consistency can be stated by
invariants in OCL. To validate a given session model, we have to translate these
OCL invariants to a set of graph constraints.

In a first approach, this is possible if the OCL constraints are restricted to
those requiring or prohibiting the existence of certain object structures and
attribute values. This kind of atomic constraint can be directly translated to
graph constraints which describe the existence or non-existence as well as the
uniqueness of certain graph parts, or compare attribute values. Moreover,
propositional formulae on this kind of atomic constraint can be translated to
graph formulae.

Fig. 12. Rule jumpBack2

Fig. 13. Constraints consistentNumberOfItems and consistentSum

A graph transformation system is consistent wrt. a set of graph formulae if 
the initial graph satisfies them and all rules preserve them. In [7], an algorithm is 
presented which checks whether a rule preserves graph constraints. If a constraint 
is not preserved, new pre-conditions are generated for this rule restricting its 
applicability such that the consistency is always ensured. Recently, this checking 
algorithm has been extended to attributed graph transformation and to graph 
formulae in [10]. It has to be applied to all rules in the formal session model. To 
validate a session model, the initial session state graph has to satisfy all graph 
constraints. Thereafter, we validate the action rules. 

Often multi-objects are also useful when formulating graph constraints. If
multi-objects occur in graph constraints, we do not just look for existence or
non-existence of the corresponding graph structures, but look for their maximal
occurrence, similarly to the matching of a rule with multi-objects. Thus, having
one multi-object in the constraint, we end up with a set
$\{c_i \mid i \geq 1\}$ of constraints where $c_n$ contains n copies of the
multi-object.



Example OCL constraints consistentNumberOfItems and consistentSum are
translated to graph constraints. In Figure 13 these graph constraints are
depicted, both using multi-objects. It is obvious that the initial state in
Figure 10 fulfills these constraints. Checking the rules, we find out that rule
changeIt preserves both constraints, while rule put preserves constraint
consistentNumberOfItems but not constraint consistentSum. The algorithm would
equip rule put with a pre-condition stating that y = y + p, which means that
the rule can only be applied if p = 0. The problem lies in not updating
attribute sum of the Cart vertex. Comparing rule put with its corresponding
collaboration in Figure 6, we find out that constraint c.sum = c.sum@pre + p
is missing as post-condition of applying method update. Compare the completed
version of this collaboration in Figure 14.

Fig. 14. Completed collaboration put (with post-condition
{cart.sum = cart.sum@pre + p})
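
The finding of this example can be reproduced on sample states with a small Python sketch. Note the difference in method: the algorithm of [7] analyses rules statically, whereas the sketch below merely applies two hand-written versions of put to a concrete state and evaluates the invariants afterwards. The dictionary encoding of a state and the function names are assumptions of this illustration.

```python
def put_incomplete(state, price):
    """Rule put without the sum update, as derived from Figure 6."""
    return {"number": state["number"] + 1,
            "sum": state["sum"],                  # attribute sum left as is
            "prices": state["prices"] + [price]}


def put_completed(state, price):
    """Rule put with the post-condition sum = sum@pre + p."""
    return {**put_incomplete(state, price), "sum": state["sum"] + price}


def consistent_number(state):
    return state["number"] == len(state["prices"])


def consistent_sum(state):
    return state["sum"] == sum(state["prices"])


initial = {"number": 0, "sum": 0, "prices": []}
broken = put_incomplete(initial, 5)
fixed = put_completed(initial, 5)
```

The incomplete version keeps the number of items consistent but violates the sum invariant for every price other than 0, which corresponds to the generated pre-condition p = 0; the completed version preserves both invariants.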



Tool Support for Validation The validation of graph constraints in a formal 
graph transformation model is supported by the graph transformation engine 
AGG (see [5]). The tool provides several visual editors to view and edit graph 
transformation systems, an interpreter to simulate concrete scenarios, and a 
debugger. Recently, an initiative has been started to implement static analysis 
techniques for graph transformation such as consistency checking and critical 
pair analysis to determine conflicts and dependencies between different actions. 
Graph formulae as described above are supported, except for the use of
multi-objects within graph constraints. It is up to future work to extend
constraint checking to this case. To use AGG for session validation, a UML model
which has been produced by some CASE tool has to be translated to the AGG input
format such that the corresponding semantic model is constructed. Assuming that
the UML model is given in XMI (XML Meta data Interchange) format [2], an 
XSL (Extensible Stylesheet Language) transformation has to be provided to 
produce a GGX (Graph Grammar Exchange) document as input for AGG. If 
the validation results in transformed rules, they have to be retranslated to XML 

4 Conclusion 

Session management is a major issue in the design of Web applications. In this
paper, we presented a UML-based approach to session modelling which supports
the description of session-specific data together with version management
aspects. Future work on this approach has to further investigate the integration
of this work into the comprehensive UML-based Web Engineering approach by
Nora Koch [9]. Translating the UML model of a session to a formal model based
on graph transformation, the semantic consistency of a session model can be
validated. Having formulated consistency conditions within a restricted form of
OCL, they can be translated to graph formulae which are then used to validate
the formal session model. As pointed out, this validation process leads to
a formulation of important session-specific conditions in collaborations.
Although there are a number of approaches to session management at the design
level, our approach adds explicit version management issues and offers the
possibility to validate semantic consistency constraints for session management
in Web applications (based on UML).

Consistent session design is not only a key issue for the development of Web 
applications, but plays an important role when designing custom wizards in 
any application. Although presented within the setting of Web applications, our 
approach seems to be general enough to design consistent session management 
in any application. It is up to future work to show the general usability of the 
session modelling and validation approach we have presented. 
Abstract. Graph reduction specifications (GRSs) are a powerful new
method for specifying classes of pointer data structures (shapes). They 
cover important shapes, like various forms of balanced trees, that cannot 
be handled by existing methods. 

This paper formally defines GRSs as graph reduction systems with a 
signature restriction and an accepting graph. We are mainly interested 
in PGRSs — polynomially-terminating GRSs whose graph languages are 
closed under reduction and have a polynomial membership test. 

We investigate the power of the PGRS framework by presenting example 
specifications and by considering its language closure properties: PGRS 
languages are closed under intersection; not closed under union (unless 
we drop the closedness restriction and exclude languages with the empty 
graph); and not closed under complement. 

Our practical investigation presents example PGRSs including cyclic 
lists, trees, balanced trees and red-black trees. In each case we try to 
make the PGRS as simple as possible where simpler means fewer rules, 
simpler termination and closure proofs and fewer non-terminals. We show 
how to prove the correctness of a PGRS and give methods for demon- 
strating that a given shape cannot be specified by a PGRS with certain 
simplicity properties. 



1 Introduction 

Pointer manipulation is notoriously dangerous in languages like C, where there
is nothing to prevent the creation and dereferencing of dangling pointers, the
dereferencing of nil pointers, or structural changes that break the assumptions
of a program, such as turning a list into a cycle.

Our goal is to improve the safety of pointer programs by providing (1) means 
for programmers to specify pointer data structure shapes, and (2) algorithms to 
check statically whether programs preserve the specified shapes. We approach 
these aims as follows. 

1. Develop a formal notation for specifying shapes (languages of pointer data 
structures); that is the main concern of this paper. We show how shapes can be 
defined by graph reduction specifications (GRSs), which are the dual of graph 
grammars in that graphs in a language are reduced to an accepting graph rather 

* Work partly funded by EPSRC project Safe Pointers by Graph Transformation [1]. 

Fig. 1. A graph reduction specification of binary trees 



than generated from a start graph. Polynomially terminating GRSs whose lan- 
guages are closed under reduction (PGRSs) allow a simple and efficient mem- 
bership test for individual structures, yet seem powerful enough to specify all 
common data structures. 

2. The effect of a pointer algorithm on the shape of a data structure is cap- 
tured by abstracting the algorithm to a graph rewrite system annotated with 
the intended structure shape at the start, end and intermediate points if needed. 
A static verifier then checks the shape annotations (see [3]). 

Example 1 (Specifications of binary trees and full binary trees) 

Fig. 1 gives a graph reduction specification of binary trees. The smallest binary 
tree is a leaf. We can draw it as $Acc_L$, the accepting graph, a single node 
labelled L. Trees may contain unary or binary branches. Therefore any other 
binary tree can be reduced to $Acc_L$ by repeatedly applying the reduction rules 
UtoL and BtoL. These replace bottom-most branches, whose arcs point to leaves, 
by a leaf. The "1" indicates that any arcs pointing to the branch are left in place 
by the reduction rule. Full binary trees are specified by omitting the rule UtoL 
so that each node is either a leaf or a binary branch. 

This reduction system only recognises trees because applying the inverse of 
its rules to any tree always produces a tree. Intuitively, forests cannot reduce to 
a single leaf as the rules do not break up graphs or connect broken graphs; no 
rule reduces a cycle; rules are matched injectively so BtoL cannot reduce a DAG 
with shared sub-trees; our signatures, introduced later, limit node outdegree so 
branches must be unary or binary. □ 
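
As an informal illustration of Example 1, the reduction can be sketched in Python. The dictionary encoding of graphs and the names `step` and `is_binary_tree` are assumptions made for this sketch only; the rules UtoL and BtoL are read off the description above, with the distinctness test standing in for injective matching and the in-degree test for the dangling condition.

```python
from typing import Dict, Tuple

# Hypothetical encoding: node id -> (label, {edge label: target node id}).
Graph = Dict[int, Tuple[str, Dict[str, int]]]


def indegree(g: Graph, v: int) -> int:
    """Count the arcs whose target is v."""
    return sum(1 for _, succ in g.values() for t in succ.values() if t == v)


def removable_leaf(g: Graph, v: int) -> bool:
    """Dangling condition: a deleted leaf has no in-arc besides the matched one."""
    return g[v][0] == "L" and indegree(g, v) == 1


def step(g: Graph) -> bool:
    """Apply UtoL or BtoL once, in place; return False if no rule matches."""
    for v, (label, succ) in list(g.items()):
        if label == "U" and removable_leaf(g, succ["c"]):
            del g[succ["c"]]
            g[v] = ("L", {})  # node 1 is kept and relabelled as a leaf
            return True
        if (label == "B" and succ["l"] != succ["r"]  # injective matching
                and removable_leaf(g, succ["l"])
                and removable_leaf(g, succ["r"])):
            del g[succ["l"]], g[succ["r"]]
            g[v] = ("L", {})
            return True
    return False


def is_binary_tree(g: Graph) -> bool:
    """Reduce as long as possible and compare the result with Acc_L."""
    g = dict(g)  # step() replaces entries, so a shallow copy suffices
    while step(g):
        pass
    return len(g) == 1 and next(iter(g.values())) == ("L", {})


tree = {0: ("B", {"l": 1, "r": 2}), 1: ("L", {}),
        2: ("U", {"c": 3}), 3: ("L", {})}
dag = {0: ("B", {"l": 1, "r": 1}), 1: ("L", {})}  # shared sub-tree
```

Here `is_binary_tree(tree)` succeeds, while `dag` is irreducible because both arcs of the branch point to the same leaf.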

Graph reduction is a very powerful specification mechanism; we show how 
it can be used to define various kinds of balanced binary trees. Some shapes are 
harder to specify than others; we categorise shapes according to whether their 
PGRS needs non-terminal node labels; the difficulty of proving termination and 
closedness under reduction are also indicative of shape complexity. Some difficult 
languages can be specified as the union or intersection of simpler languages; we 
consider how the power of single PGRSs compares with such combinations. 

Although many of our examples are trees, a graph-based specification framework 
is essential because we need precise control over the degree of sharing. 
Term rewriting ignores this issue and algebraic type specifications are unable to 
guarantee that members of tree data types are trees. Previous work on shape 
specifications uses variants of context-free graph grammars, or certain logics, 
Fig. 2. A graph reduction specification of cyclic lists 



which are unable to express properties like balance [12, 15, 5, 14, 8]. GRSs are 
far more powerful than the syntactic type restrictions expressible in languages 
like AGG, PROGRES and Fujaba. The suitability of general-purpose specifica- 
tion languages like OCL for specifying and checking shapes is unclear. PGRSs 
can define shapes with sharing and cycles. Our second example presents cyclic 
lists. 

Example 2 (Specification of cyclic lists) 

Fig. 2 gives rules defining cyclic lists. A single loop, $Acc_C$, is a cyclic list and 
all other cyclic lists reduce to $Acc_C$. Two-link cycles are reduced by TwoLoop. 
Longer cycles are reduced a link at a time by Unlink. 

Clearly a graph of several disjoint cycles will not reduce to a single loop; no 
rules reduce branching or merging structures, and acyclic chains cannot become 
loops. □ 
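
The following Python sketch illustrates Example 2 under stated assumptions. Fig. 2 is not reproduced here, so the exact shape of the two rules is our guess from the prose: TwoLoop collapses a two-link cycle onto one node, and Unlink bypasses a node that has no other in-arcs. The encoding (each node maps to its single successor) and all function names are hypothetical.

```python
from typing import Dict

# Hypothetical encoding: every node has exactly one outgoing "next" arc.
NextMap = Dict[int, int]


def indegree(nxt: NextMap, v: int) -> int:
    """Count the arcs whose target is v."""
    return sum(1 for t in nxt.values() if t == v)


def step(nxt: NextMap) -> bool:
    """Apply TwoLoop or Unlink once, in place; return False if neither matches."""
    for a, b in list(nxt.items()):
        if a == b or indegree(nxt, b) != 1:
            continue  # b is deleted, so it needs to be distinct and unshared
        c = nxt[b]
        del nxt[b]
        # TwoLoop (assumed): a -> b -> a becomes the loop a -> a.
        # Unlink (assumed):  a -> b -> c becomes a -> c, for distinct a, b, c.
        nxt[a] = a if c == a else c
        return True
    return False


def is_cyclic_list(nxt: NextMap) -> bool:
    """Reduce as long as possible and compare with a single loop."""
    nxt = dict(nxt)
    while step(nxt):
        pass
    return len(nxt) == 1 and all(v == t for v, t in nxt.items())
```

With this encoding a three-node cycle `{0: 1, 1: 2, 2: 0}` is accepted, whereas two disjoint loops `{0: 0, 1: 1}` are rejected.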

The rest of this paper is organised as follows. Section 2 defines GRSs. Sec- 
tion 3 discusses polynomial GRSs (PGRSs) and their complexity for shape check- 
ing. Section 4 discusses power, showing when shapes are undefinable without 
non-terminals and demonstrating the closure properties of PGRS languages. 
Section 5 applies our theory to specify red-black trees. Section 6 discusses related 
work. Section 7 concludes. Proofs are omitted from this paper; they are 
given in the full technical report [2]. 

2 Graph Reduction Specifications 

This section describes our framework for specifying graph languages by reduc- 
tion systems. We define graphs, rules and derivations as in the double-pushout 
approach [10], and add a signature restriction to ensure that graphs are models 
of data structures and that rules preserve the restriction. The running example 
builds a specification of balanced binary trees (BBTs) — binary trees in which 
all paths from the root to a leaf have the same length. 

Definition 1 (Signature) 

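A signature $\Sigma = (\mathcal{C}_V, \mathcal{C}_N, \mathcal{C}_E, \mathit{type} : \mathcal{C}_V \to \wp(\mathcal{C}_E))$ consists of a finite set $\mathcal{C}_V$ of 
vertex labels, a set of non-terminal vertex labels $\mathcal{C}_N$ such that $\mathcal{C}_N \subseteq \mathcal{C}_V$, 
a finite set of edge labels $\mathcal{C}_E$ and a total function $\mathit{type}$ assigning a set of edge 
labels to each vertex label. □ 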
Fig. 3. Two $\Sigma_{BT}$-total graphs. The right one is a BBT, the left one is not 



Intuitively, graph vertices represent tagged records. Their labels are the tags. 
Outgoing edges represent the record pointer fields of which each tag has a fixed 
selection defined by $\mathit{type}$. Edge labels in $\mathcal{C}_E$ correspond to the names of pointer 
fields. Non-terminal labels may occur in intermediate graphs during reduction 
but not in any graph representing a pointer structure. There is no need to restrict 
the permissible target node labels for edges in the signature because the reduction 
rules introduced below can encode any such restrictions. In the following, 
$\Sigma$ always denotes an arbitrary but fixed signature $(\mathcal{C}_V, \mathcal{C}_N, \mathcal{C}_E, \mathit{type})$. 

Example 3 (Binary tree signature) 

Let $\Sigma_{BT} = (\{B, U, L\}, \{\}, \{l, r, c\}, \{B \mapsto \{l, r\}, U \mapsto \{c\}, L \mapsto \{\}\})$. Tree nodes 
are labelled B(inary branch), U(nary branch) or L(eaf). There are no non-terminals. 
Arcs are labelled l(eft), r(ight) or c(hild). Binary branches have left 
and right outgoing arcs, unary branches have a child and leaves have no arcs. □ 
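
To make Definition 1 and Example 3 concrete, a signature can be written down as a small Python record. The class `Signature`, its field names and the validation in `__post_init__` are assumptions of this sketch; only the content of `SIGMA_BT` comes from Example 3.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class Signature:
    vertex_labels: FrozenSet[str]          # C_V
    nonterminals: FrozenSet[str]           # C_N, required to be a subset of C_V
    edge_labels: FrozenSet[str]            # C_E
    type_of: Dict[str, FrozenSet[str]]     # type : C_V -> P(C_E)

    def __post_init__(self) -> None:
        if not self.nonterminals <= self.vertex_labels:
            raise ValueError("non-terminals must be vertex labels")
        if set(self.type_of) != set(self.vertex_labels):
            raise ValueError("type must be total on the vertex labels")
        for labels in self.type_of.values():
            if not labels <= self.edge_labels:
                raise ValueError("type must yield sets of edge labels")


# The binary tree signature of Example 3.
SIGMA_BT = Signature(
    vertex_labels=frozenset("BUL"),
    nonterminals=frozenset(),
    edge_labels=frozenset("lrc"),
    type_of={"B": frozenset("lr"), "U": frozenset("c"), "L": frozenset()},
)
```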

Definitions 2, 3 and 4 below are consistent with the double-pushout approach 
to defining labelled graphs, morphisms, rules and derivations (see [10]; [11] considers 
graph relabelling). Fig. 3 shows two example graphs over $\Sigma_{BT}$. 

Definition 2 (Graph) 

A graph over $\Sigma$, $G = (V_G, E_G, s_G, t_G, l_G, m_G)$, consists of: a finite set of vertices 
$V_G$; a finite set of edges $E_G$; total functions $s_G, t_G : E_G \to V_G$ assigning 
a source and target vertex to each edge; a partial node labelling function 
$l_G : V_G \to \mathcal{C}_V$ (a partial function $f : A \to B$ maps $\mathrm{dom} f$, a subset 
of $A$, to $B$. We write $f(x) = \bot$ when $x \notin \mathrm{dom} f$); and a total edge labelling 
function $m_G : E_G \to \mathcal{C}_E$. □ 



Definition 3 (Morphism, inclusion and rule) 

A graph morphism $g : G \to H$ consists of a node mapping $g_V : V_G \to V_H$ 
and an edge mapping $g_E : E_G \to E_H$ that preserve sources, targets and labels: 
$s_H \circ g_E = g_V \circ s_G$, $t_H \circ g_E = g_V \circ t_G$, $m_H \circ g_E = m_G$ and $l_H(g_V(x)) = l_G(x)$ 
for all nodes $x$ where $l_G(x) \neq \bot$ ($f(x) = \bot$ means $f$ is undefined for $x$). An isomorphism 
is a morphism that is injective and surjective in both components 
and maps unlabelled nodes to unlabelled nodes. If there is an isomorphism 
from $G$ to $H$ they are isomorphic, denoted by $G \cong H$. Applying morphism 
$g : G \to H$ to graph $G$ yields a graph $gG$ where: $V_{gG} = g_V V_G$ (i.e. apply $g_V$ 
to each node in $V_G$); $E_{gG} = g_E E_G$; $s_G(e) = n \Leftrightarrow s_{gG}(g_E(e)) = g_V(n)$ and similarly for 
targets; $m_G(e) = m \Leftrightarrow m_{gG}(g_E(e)) = m$; $l_G(n) = l \Leftrightarrow l_{gG}(g_V(n)) = l$. A graph 
inclusion $H \supseteq G$ is a graph morphism $g : G \to H$ such that $g(x) = x$ for all 
vertices and edges $x$ in $G$. Note that inclusions may map unlabelled nodes to 
labelled nodes. 

A rule $r = (L \supseteq K \subseteq R)$ consists of three graphs: the interface graph $K$ and 
the left and right graphs $L$ and $R$ which both include $K$. □ 

Intuitively, a rule deletes nodes in $L - K$, preserves nodes in $K$ and allocates 
nodes in $R - K$. In [10] rules may merge nodes but we have no need for this more 
general formulation here. Our pictures of rules show the left and right graphs; 
the interface graph is always just the set of numbered vertices common to left 
and right. For example, the interface of BtoL in Fig. 1 consists of the unlabelled 
node 1. So BtoL deletes two leaf nodes and two arcs, and preserves node 1 which 
is relabelled as a leaf. 

Definition 4 (Direct derivation) 

Graph $G$ directly derives graph $H$ through rule $r = (L \supseteq K \subseteq R)$ and morphism 
$g$, written $G \Rightarrow H$, $G \Rightarrow_r H$ or $G \Rightarrow_{r,g} H$, if there is an injective graph 
morphism $g : L \to G$ such that: 1. no edge in $G - gL$ is incident to a node in 
$gL - gK$ (the dangling condition); 2. $H \cong H'$ where $H'$ is constructed from $G$ 
as follows: (i) remove all vertices and edges in $gL - gK$ (and restrict $s_G$, $t_G$, $l_G$ 
and $m_G$ accordingly) to obtain a subgraph $D$ of $G$, (ii) add disjointly all vertices 
and edges (and their labels) in $R - K$ to $D$ to form $H'$; so there is another 
injective morphism $h : R \to H'$ with $h(R - K) \cap D = \emptyset$; if the source of an 
edge $e \in R - K$ is $x \in V_K$ then $s_{H'}(h(e))$ is $g(x)$ otherwise it is $h(x)$; similarly 
for targets; for every vertex $x \in V_K$ if $l_L(x) \neq l_R(x)$, the label of $g(x)$ in $H'$ 
becomes $l_R(x)$. □ 

Injectivity of the matching morphism $g$ means that BtoL in Fig. 1 is only 
applicable to a graph in which some B-labelled node has left and right arcs 
to distinct L-labelled nodes; the dangling condition means the L-labelled nodes 
must have no other in-arcs and the B-labelled node may have in-arcs. 

If $H \cong G$ or $H$ is derived from $G$ by a sequence of direct derivations through 
rules in set $\mathcal{R}$ we write $G \Rightarrow_{\mathcal{R}}^{*} H$ or $G \Rightarrow^{*} H$. If no graph can be directly derived 
from $G$ through a rule in $\mathcal{R}$ we say $G$ is $\mathcal{R}$-irreducible. Definitions 2 and 3 
are too general for modelling data structures because the outdegree of nodes is 
unlimited, and graphs and rules can disrespect the intentions of our signatures. 

Example 4 (Unrestricted graph reduction is too general) 

Fig. 4 shows a simple rule Rel which relabels a node, and an example derivation 
in which the relabelling results in a graph containing a leaf with a child. 
Unrestricted rules could make trees cyclic or give branches multiple left-children. □ 

Specifying Pointer Structures by Graph Reduction 

Fig. 4. A rule Rel, which does not respect the BT signature, and the effect of applying 
it to a graph which does respect the BT signature 



Definition 5 (Outlabels and $\Sigma$-graph) 

The outlabels of node $v$ in graph $G$ are the set of labels of edges whose source 
is $v$: $\mathit{outlabels}_G(v) = \{m_G(e) \mid s_G(e) = v\}$. 

A graph $G$ respects $\Sigma$, or $G$ is a $\Sigma$-graph for short, if: (1) $\forall e, e' \in E_G \bullet s_G(e) = 
s_G(e') \Rightarrow m_G(e) \neq m_G(e') \vee e = e'$ and (2) $\forall v \in V_G \bullet l_G(v) \neq \bot \Rightarrow \mathit{outlabels}_G(v) \subseteq 
\mathit{type}(l_G(v))$. Note the set of $\Sigma$-graphs is closed under subgraph selection. □ 

Every node has at most one outgoing edge with any given label, and the 
outlabels of a node labelled l form a subset of the type of l. 

Definition 6 ($\Sigma$-total graphs) 

A $\Sigma$-graph $G$ is $\Sigma$-total if $l_G$ is total and for every node $v \in V_G$, $\mathit{outlabels}_G(v) = 
\mathit{type}(l_G(v))$. □ 

A $\Sigma$-total graph models a data structure: all its nodes are labelled and each 
node has a full set of outlabels. Apart from these restrictions nodes may be 
connected to others in the same graph arbitrarily. In this paper we do not model 
nil pointers. Alternatives are considered in [2]. Non-total $\Sigma$-graphs are used in 
rules where it is essential, or convenient, to have unlabelled nodes and missing 
outlabels. 

Example 5 ($\Sigma_{BT}$-graphs and $\Sigma_{BT}$-total graphs) 

In the right half of Fig. 4, the left graph respects $\Sigma_{BT}$ and the right graph does 
not. In Fig. 3 both graphs are $\Sigma_{BT}$-total. □ 
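
Definitions 5 and 6 translate almost directly into executable checks. The sketch below assumes a hypothetical encoding in which a graph is a dictionary of node labels (with `None` for unlabelled nodes) plus a list of labelled edges; `TYPE_BT` is the type function of Example 3, and the function names are our own.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

Edge = Tuple[int, str, int]          # (source, edge label, target)
Nodes = Dict[int, Optional[str]]     # node id -> label, None if unlabelled
TypeMap = Dict[str, FrozenSet[str]]

TYPE_BT: TypeMap = {"B": frozenset("lr"), "U": frozenset("c"), "L": frozenset()}


def outlabels(edges: List[Edge], v: int) -> List[str]:
    """Labels of the edges whose source is v (with repetitions)."""
    return [m for s, m, _ in edges if s == v]


def respects(nodes: Nodes, edges: List[Edge], type_of: TypeMap) -> bool:
    """Definition 5: at most one out-edge per label, and labels within type."""
    for v, label in nodes.items():
        out = outlabels(edges, v)
        if len(out) != len(set(out)):        # condition (1)
            return False
        if label is not None and not set(out) <= type_of[label]:  # condition (2)
            return False
    return True


def is_total(nodes: Nodes, edges: List[Edge], type_of: TypeMap) -> bool:
    """Definition 6: every node is labelled and has exactly its typed outlabels."""
    return respects(nodes, edges, type_of) and all(
        label is not None and set(outlabels(edges, v)) == type_of[label]
        for v, label in nodes.items())
```

For instance, a leaf with a child, as produced by Rel in Fig. 4, fails `respects`, while a rule graph containing an unlabelled node can respect the signature without being total.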

To prevent reduction rules breaking either the signature or the totality of 
graphs we define a simple restricted rule form: $\Sigma$-total rules. This motivates 
the following restrictions. 

Definition 7 ($\Sigma$-total rule) 

A rule $(L \supseteq K \subseteq R)$ is a $\Sigma$-total rule if $L$, $R$ are $\Sigma$-graphs and for every node $x$: 

1. $l_L(x) = \bot \Rightarrow x \in V_K \wedge l_R(x) = \bot \wedge \mathit{outlabels}_L(x) = \mathit{outlabels}_R(x)$. 

That is, unlabelled nodes in $L$ are preserved and remain unlabelled with the 
same outlabels. 

2. $x \in V_K \wedge l_L(x) \neq \bot \wedge l_L(x) = l_R(x) \Rightarrow \mathit{outlabels}_L(x) = \mathit{outlabels}_R(x)$. 

That is, labelled nodes in $L$ which are preserved with the same label have the 
same outlabels in $L$ and $R$. 

3. $x \in V_K \wedge l_L(x) \neq \bot \wedge l_L(x) \neq l_R(x) \Rightarrow 
l_R(x) \neq \bot \wedge \mathit{outlabels}_L(x) = \mathit{type}(l_L(x)) \wedge \mathit{outlabels}_R(x) = \mathit{type}(l_R(x))$. 

That is, relabelled nodes have a complete set of outlabels in $L$ and $R$. Nodes 
may not be labelled in $L$ and unlabelled in $R$, or vice versa. 

4. $x \in V_L - V_K \Rightarrow \mathit{outlabels}_L(x) = \mathit{type}(l_L(x))$. 

That is, deleted nodes have a complete set of outlabels. 

5. $x \in V_R - V_K \Rightarrow l_R(x) \neq \bot \wedge \mathit{outlabels}_R(x) = \mathit{type}(l_R(x))$. 

That is, allocated nodes are labelled and have a complete set of outlabels. □ 

Fig. 5. BBT shape specification rules 

Example 6 (Rules specifying balanced binary trees) 

Example 7 specifies BBTs with the $\Sigma_{BT}$-total rules $\mathcal{R}_{BBT}$ = {PickLeaf, 
PushBranch, FellTrunk}, given in Fig. 5. PickLeaf replaces a binary branch of 
leaves by a unary branch of a leaf; PushBranch forces a binary branch of unary 
branches one level down and applies anywhere in a tree. Note that both rules 
preserve height and balance. FellTrunk removes unary branches which are not 
the target of any arcs; it preserves balance but decreases height. □ 

Theorem 1 ($\Sigma$-total rules preserve $\Sigma$ and $\Sigma$-totality) 

Let $r$ be a $\Sigma$-total rule and $G \Rightarrow_r H$ a direct derivation on graphs over $\Sigma$. 
Then $G$ is a $\Sigma$-graph iff $H$ is a $\Sigma$-graph. Moreover, $G$ is $\Sigma$-total iff $H$ is $\Sigma$-total. □ 



Definition 8 (GRS, NT-free GRS) 

A graph reduction specification (GRS) $S = (\Sigma, \mathcal{R}, Acc)$ consists of a signature 
$\Sigma$, a finite set of $\Sigma$-total rules $\mathcal{R}$ and an $\mathcal{R}$-irreducible $\Sigma$-total graph $Acc$, the 
accepting graph. The graph language of $S$ is $\mathcal{L}(S) = \{G \mid G \Rightarrow_{\mathcal{R}}^{*} Acc \wedge l_G(V_G) \cap 
\mathcal{C}_N = \emptyset\}$. If $\mathcal{C}_N = \emptyset$ we say that $S$ is NT-free. □ 

Fig. 6. A reduction of the right graph in Fig. 3. The steps in the four reduction 
sequences are: PickLeaf, PickLeaf; PushBranch, PickLeaf; PushBranch, PushBranch, 
PickLeaf; FellTrunk, FellTrunk, FellTrunk 

Termination and closedness are discussed in Section 3. Note that $Acc$ is $\Sigma$-total, 
so every graph in $\mathcal{L}(S)$ is $\Sigma$-total by Theorem 1. 

Example 7 (Specification of balanced binary trees) 

We define BBTs by the NT-free GRS $BBT = (\Sigma_{BT}, \mathcal{R}_{BBT}, Acc_L)$, where $\mathcal{R}_{BBT}$ 
is defined in Example 6. That is, $\mathcal{R}_{BBT}$ reduces BBTs, and nothing else, to 
$Acc_L$. Fig. 6 shows an example reduction. The left graph in Fig. 3 is irreducible 
under $\mathcal{R}_{BBT}$, owing to the various forms of sharing it contains, and therefore is 
not a BBT (it is a balanced binary DAG); the right graph is a BBT. □ 

Theorem 2 (BBT specifies balanced binary trees) 

For every $\Sigma_{BT}$-graph $G$, $G \in \mathcal{L}(BBT)$ iff $G$ is a balanced binary tree. □ 
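
The BBT specification can be exercised with a small Python sketch. Since Fig. 5 is not reproduced here, the three rules below are reconstructed from the prose of Example 6 and are assumptions of this sketch, as are the dictionary encoding and all function names. Node identifiers are reused where a rule would allocate a fresh node, which yields the same graph up to isomorphism.

```python
from typing import Dict, Tuple

# Hypothetical encoding: node id -> (label, {edge label: target node id}).
Graph = Dict[int, Tuple[str, Dict[str, int]]]


def indegree(g: Graph, v: int) -> int:
    return sum(1 for _, succ in g.values() for t in succ.values() if t == v)


def pick_leaf(g: Graph) -> bool:
    """B(L, L) becomes U(L); both leaves must be distinct and unshared."""
    for v, (label, succ) in list(g.items()):
        if label != "B":
            continue
        lft, rgt = succ["l"], succ["r"]
        if lft != rgt and all(g[n][0] == "L" and indegree(g, n) == 1
                              for n in (lft, rgt)):
            del g[rgt]
            g[v] = ("U", {"c": lft})
            return True
    return False


def push_branch(g: Graph) -> bool:
    """B(U(x), U(y)) becomes U(B(x, y)) for pairwise distinct matched nodes."""
    for v, (label, succ) in list(g.items()):
        if label != "B":
            continue
        lft, rgt = succ["l"], succ["r"]
        if lft == rgt or g[lft][0] != "U" or g[rgt][0] != "U":
            continue
        if indegree(g, lft) != 1 or indegree(g, rgt) != 1:
            continue  # dangling condition on the replaced unary branches
        x, y = g[lft][1]["c"], g[rgt][1]["c"]
        if len({v, lft, rgt, x, y}) != 5:
            continue  # injective matching
        del g[rgt]
        g[lft] = ("B", {"l": x, "r": y})
        g[v] = ("U", {"c": lft})
        return True
    return False


def fell_trunk(g: Graph) -> bool:
    """Delete a unary branch that is not the target of any arc."""
    for v, (label, succ) in list(g.items()):
        if label == "U" and succ["c"] != v and indegree(g, v) == 0:
            del g[v]
            return True
    return False


def is_bbt(g: Graph) -> bool:
    """Reduce as long as possible, then compare with the single leaf Acc_L."""
    g = dict(g)
    while pick_leaf(g) or push_branch(g) or fell_trunk(g):
        pass
    return len(g) == 1 and next(iter(g.values())) == ("L", {})
```

Because the rule set is closed under reduction, the order in which the three rules are tried does not affect the outcome, so no backtracking is needed in `is_bbt`.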



3 Membership Checking 

Graph reduction rules are just reversed graph-grammar production rules so reduction 
specifications can define every recursively enumerable set of $\Sigma$-total 
graphs (that exclude the empty graph, see [2]). This follows from Uesu's result 
that double-pushout graph grammars can generate every recursively enumerable 
set of graphs [18]. Consequently, arbitrary reduction rules can specify 
languages with an undecidable membership problem. 

For testing example structures we need specifications for which language 
membership can be checked, preferably in polynomial time. Therefore we will 
require that GRSs are polynomially terminating and their languages closed under 
reduction. Testing membership of such languages is simple: given a graph $G$, 
check that $G$ only has terminal labels and apply the rules in $\mathcal{R}$ (nondeterministically) 
as long as possible; $G$ belongs to $\mathcal{L}(S)$ iff the resulting graph is isomorphic 
to $Acc$. First we consider termination. 

Definition 9 (Graph size, polynomially terminating, size-reducing) 

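Graph size is defined by $size(G) = \#V_G + \#E_G$ where $\#$ denotes set cardinality. 
A GRS $S = (\Sigma, \mathcal{R}, Acc)$ is terminating if there is no infinite derivation 
$G_0 \Rightarrow_{\mathcal{R}} G_1 \Rightarrow_{\mathcal{R}} \cdots$. It is polynomially terminating if there is a polynomial $p$ such that for 
every derivation $G \Rightarrow_{\mathcal{R}} G_1 \Rightarrow_{\mathcal{R}} \cdots \Rightarrow_{\mathcal{R}} G_n$, $n < p(size(G))$. It is size-reducing 
if $size(L) > size(R)$ for every rule $(L \supseteq K \subseteq R)$ in $\mathcal{R}$. □ 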

The example specifications in this paper have linear reduction lengths; this 
is usually easily shown, but there is no general decision method, so new GRSs 
may require individual termination analysis. For example, BBT is size-reducing, 
while RBT (Section 5) reduces the natural number $size(G) + \#\{v \mid l_G(v) = B\}$ 
at each step. Now we consider closedness and complexity. 
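
The size-reducing property of Definition 9 is a purely syntactic check on the rules. The sketch below assumes a hypothetical encoding of each rule side as a pair (vertex set, edge list); the two example rules follow the description of UtoL and BtoL in Example 1, and a relabelling rule in the style of Rel is included as a counter-example.

```python
from typing import List, Set, Tuple

Edge = Tuple[int, str, int]                 # (source, edge label, target)
RuleSide = Tuple[Set[int], List[Edge]]      # (vertices, edges)
Rule = Tuple[RuleSide, RuleSide]            # (left graph L, right graph R)


def size(graph: RuleSide) -> int:
    """size(G) = #V_G + #E_G."""
    vertices, edges = graph
    return len(vertices) + len(edges)


def is_size_reducing(rules: List[Rule]) -> bool:
    """True if size(L) > size(R) for every rule."""
    return all(size(left) > size(right) for left, right in rules)


# Left and right graphs of BtoL and UtoL; node 1 is the interface node.
BTOL: Rule = (({1, 2, 3}, [(1, "l", 2), (1, "r", 3)]), ({1}, []))
UTOL: Rule = (({1, 2}, [(1, "c", 2)]), ({1}, []))
# A pure relabelling keeps the size unchanged, so it is not size-reducing.
REL: Rule = (({1}, []), ({1}, []))
```

A size-reducing rule set terminates after at most $size(G)$ steps, which is why this check is a convenient sufficient condition for linear termination.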

Definition 10 (Closedness, Confluence, PGRS) 

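A GRS $S = (\Sigma, \mathcal{R}, Acc)$ is closed if for every direct derivation $G \Rightarrow_{\mathcal{R}} H$, 
$G \Rightarrow_{\mathcal{R}}^{*} Acc$ implies $H \Rightarrow_{\mathcal{R}}^{*} Acc$. $S$ is confluent if for every pair of derivations 
$H_1 \Leftarrow_{\mathcal{R}}^{*} G \Rightarrow_{\mathcal{R}}^{*} H_2$ over $\Sigma$, there is a graph $H$ such that $H_1 \Rightarrow_{\mathcal{R}}^{*} H \Leftarrow_{\mathcal{R}}^{*} H_2$. 
A polynomially terminating and closed GRS is a polynomial GRS (PGRS). □ 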

Confluence implies closedness (the converse does not hold). Confluence of 
a terminating specification can be shown by adapting the critical pair method 
of [17] to GRSs (see [2]). All examples in this paper are confluent by this method 
as all their critical pairs are strongly joinable. Two reduction rules form a crit- 
ical pair if they can be applied to the same graph such that one rule removes 
part of the graph required to apply the other rule. Closedness can be tested by 
disregarding any critical pair which only occurs as part of non-language member 
graphs. 

Theorem 3 (Complexity of testing membership) 

If $S$ is a PGRS then membership of $\mathcal{L}(S)$ is decidable in polynomial time. □ 

We assume $S$ is fixed, so the number of rules is fixed and the size of the 
largest left graph in $\mathcal{R}$ is a constant $c$. Checking whether any rule in $\mathcal{R}$ matches 
a graph $G$ requires $O(size(G)^c)$ time. This is because there are at most $size(G)^c$ 
injective mappings $V_L \to V_G$ for any left graph $L$, and checking whether a mapping 
induces a graph morphism $L \to G$ and the dangling condition can be done 
in constant time if graphs are suitably represented. Given a match, rule application 
is constant time. Hence the procedure sketched in the introduction to this 
section runs in polynomial time. The procedure is correct as the closedness of $S$ 
makes backtracking unnecessary. 
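
The counting argument can be checked on a small instance. The sketch below enumerates the candidate injective vertex mappings with the standard-library function `itertools.permutations`; the function name `injective_mappings` and the node names are illustrative assumptions, and a real matcher would additionally test each candidate for the morphism and dangling conditions.

```python
from itertools import permutations
from typing import Dict, Iterator, Sequence


def injective_mappings(left_nodes: Sequence[str],
                       graph_nodes: Sequence[int]) -> Iterator[Dict[str, int]]:
    """Yield every injective mapping from the rule's left nodes to graph nodes."""
    for image in permutations(graph_nodes, len(left_nodes)):
        yield dict(zip(left_nodes, image))


# A left graph with c = 3 nodes matched against a graph with n = 5 nodes.
candidates = list(injective_mappings(["1", "2", "3"], range(5)))
```

There are $5 \cdot 4 \cdot 3 = 60$ candidates here, which is within the bound $5^3 = 125$ used in the argument above.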
4 Extensions and Closure Properties 

NT-free PGRSs are powerful but there are still lots of shapes they cannot de- 
scribe; PGRSs are more powerful and GRSs have the universal specification 
power of graph grammars. This section develops the idea of classifying the sim- 
plicity of shapes by showing whether they have an NT-free specification or not. 
We show that: intersection extends the range of shapes definable by NT-free 
(P)GRSs to all the (P)GRS-definable shapes, and that (P)GRSs are closed un- 
der intersection; union extends the range of shapes definable by NT-free PGRSs 
and PGRSs, but terminating and possibly non-confluent GRSs are closed under 
union (provided $Acc \neq \emptyset$); complement extends the range of shapes definable by 
NT-free (P)GRSs and (P)GRSs. 

Complete binary trees (CBTs) are BBTs where every branch is binary. The- 
orem 4 says they cannot be defined by an NT-free GRS. Lemma 1 presents 
a general method for showing that an NT-free GRS cannot define a given shape. 

Lemma 1 (Proving graph languages are undefinable) 

Graph language $\mathcal{L}$ cannot be defined by an NT-free GRS if: 

VfceN,fcy x JSf-ma x{S(G,H) | (G, H) € J?} > k V St ^ Jzf x 

where $S(G, H) = \min\{\max\{size(L), size(R)\} \mid r = (L \supseteq K \subseteq R) \wedge G \Rightarrow_r H\}$. □ 

To use Lemma 1 we show that for every $k$ there is a graph $G \in \mathcal{L}$ which 
cannot be rewritten to some other graph $H \in \mathcal{L}$ without a rule of size at least $k$. 

Theorem 4 (CBTs cannot be defined by an NT-free GRS) 

No NT-free GRS can specify complete binary trees. □ 

We can often make a language specifiable by using non-terminals. Alterna- 
tively, we can take the intersection or union of two NT-free GRS languages. 
We show that using non-terminals is equivalent to using intersection and hence 
GRSs are closed under intersection. The following examples give non-terminal 
and intersection specifications of CBTs. 

Example 8 (Specification of complete binary trees) 

Let $CBT = (\Sigma_{BT} + (\{\}, \{U\}, \{\}, \{\}), \mathcal{R}_{BBT}, Acc_L)$. Hence CBTs are BBTs 
which do not contain any unary branches. □ 

Example 9 (CBTs by intersection) 

Let $\mathcal{L}(CBT) = \mathcal{L}(FBT) \cap \mathcal{L}(BBT)$. CBTs are full binary trees (left conjunct, 
Example 1). CBTs are balanced (right conjunct). Both GRSs are NT-free. □ 
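
The idea of Example 9 can be sketched in Python by intersecting two membership tests. For brevity the two tests are written as direct recursive predicates over nested tuples rather than as graph reductions; this encoding is an assumption of the sketch and, unlike the graph model, cannot represent sharing, so inputs are trees by construction.

```python
from typing import Optional

# Hypothetical term encoding: ("L",), ("U", child) or ("B", left, right).
LEAF = ("L",)


def is_full(t: tuple) -> bool:
    """Full binary tree: every node is a leaf or a binary branch."""
    if t[0] == "L":
        return True
    if t[0] == "B":
        return is_full(t[1]) and is_full(t[2])
    return False  # unary branches are excluded


def height(t: tuple) -> Optional[int]:
    """Common root-to-leaf path length, or None if the tree is unbalanced."""
    if t[0] == "L":
        return 0
    heights = [height(child) for child in t[1:]]
    if None in heights or len(set(heights)) != 1:
        return None
    return heights[0] + 1


def is_balanced(t: tuple) -> bool:
    return height(t) is not None


def is_complete(t: tuple) -> bool:
    """Membership of the intersection of the two languages."""
    return is_full(t) and is_balanced(t)
```

A tree such as `("B", LEAF, ("B", LEAF, LEAF))` is full but not balanced, and `("U", LEAF)` is balanced but not full, so neither is complete.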

By Theorem 4 and Example 9, the languages of NT-free (P)GRSs are not 
closed under intersection. Theorem 5 shows that (P)GRSs and intersections of 
NT-free (P)GRSs have equivalent power. Theorem 6 shows that (P)GRSs are 
closed under intersection. 
Theorem 5 (GRSs equivalent to intersections of NT-free GRSs) 

1. If $N$ is a GRS there are NT-free GRSs $S$ and $T$ s.t. $\mathcal{L}(N) = \mathcal{L}(S) \cap \mathcal{L}(T)$. 
Further, if $N$ is a PGRS then so are $S$ and $T$. 

2. If $S$ and $T$ are NT-free GRSs there is a GRS $N$ s.t. $\mathcal{L}(N) = \mathcal{L}(S) \cap \mathcal{L}(T)$. 
Further, if $S$ and $T$ are PGRSs then so is $N$. □ 

Theorem 6 (Graph reduction languages closed under intersection) 

If $S$ and $T$ are (P)GRSs, then $\mathcal{L}(S) \cap \mathcal{L}(T)$ can be defined by a (P)GRS $N$. □ 

Language union offers another way to compose specifications. It is easy to 
see that union extends the range of languages specifiable by PGRSs and NT- 
free PGRSs. For example, a GRS cannot define a finite language that includes 
the empty graph and some other graph, but such a language is easily specified 
as a union of PGRSs with no reduction rules whose accepting graphs are the 
language elements. Similarly, PGRSs are not closed under union for infinite 
languages with or without the empty graph. If we allow terminating but possibly 
non-confluent GRSs, we can show that they are closed under union, provided 
their languages exclude the empty graph (see [2]). 

GRS languages are not closed under complement. This follows from the abil- 
ity of reduction specifications to simulate Chomsky grammars. 

5 Red-Black Trees 

This section applies the theory to specify red-black trees (RBTs). Our speci- 
fication in Definition 12 is interesting because it is an NT-free PGRS but is 
not size-reducing. Theorem 7 says that a size-reducing RBT specification needs 
non-terminals (using a simplification of Lemma 1; see [2] for the proof and a size- 
reducing specification with non-terminals). 

Definition 11 (Textbook red-black tree definition [6]) 

Red-black trees are trees of binary-branches and leaves where branches are la- 
belled red or black, children of red branches are black or leaves and all paths 
from root to leaf have the same number of black nodes. □ 
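
Definition 11 can be read as a recursive predicate. The Python sketch below is our own illustration over nested tuples (an assumed encoding that cannot express sharing); it is not the graph reduction specification of Definition 12, merely a direct transcription of the textbook conditions.

```python
from typing import Optional

# Hypothetical term encoding: ("L",) for a leaf, (colour, left, right)
# for a branch, where colour is "R" (red) or "B" (black).
LEAF = ("L",)


def black_height(t: tuple) -> Optional[int]:
    """Black branches on every root-to-leaf path, or None if t is no RBT."""
    if t[0] == "L":
        return 0
    colour, left, right = t
    if colour == "R" and (left[0] == "R" or right[0] == "R"):
        return None  # children of red branches are black or leaves
    hl, hr = black_height(left), black_height(right)
    if hl is None or hl != hr:
        return None  # all paths need the same number of black nodes
    return hl + (1 if colour == "B" else 0)


def is_red_black(t: tuple) -> bool:
    return black_height(t) is not None
```

Note that, following Definition 11 as stated, the root is allowed to be red.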

Theorem 7 (A size-reducing GRS of RBTs needs non-terminals) 

Red-black trees cannot be specified by a size-reducing NT-free GRS. □ 

Definition 12 (Specification of red-black trees) 

Let $\Sigma_{RBT} = (\{R, B, L\}, \{\}, \{l, r\}, \{R \mapsto \{l, r\}, B \mapsto \{l, r\}, L \mapsto \{\}\})$ and 
$RBT = (\Sigma_{RBT}, \mathcal{R}_{RBT}, Acc_L)$, where Fig. 7 shows the reduction rules in $\mathcal{R}_{RBT}$ 
and Fig. 1 shows $Acc_L$. □ 

Note that RBT is not size-reducing but it linearly terminates as every reduction 
step reduces $size(G) + \#\{v \in V_G \mid l_G(v) = B\}$. 

Theorem 8 (Correctness of RBT) 

For every $\Sigma_{RBT}$-graph $G$, $G \in \mathcal{L}(RBT)$ if and only if $G$ is a red-black tree. □ 
[Figure: the rules PickRedLeaf ($x \in \{l, r\}$), PushRedRoot ($x \in \{B, L\}$), 
PushRedBranch ($x \in \{l, r\}$), ReddenRoot and FellStump ($x \in \{R, B\}$)] 

Fig. 7. Red-black tree reduction rules 



Each rule preserves the red-black properties and produces either a smaller 
or a redder tree (therefore $\mathcal{R}_{RBT}$ terminates). The smallest RBT is a leaf. We 
can think of the tree reduction process as follows. PickRedLeaf can remove 
any red leaf-parent with a black parent. Any red node higher up the tree can 
be pushed down the tree by recolouring it and its children as in PushRedRoot or 
PushRedBranch, provided that its grandchildren are black or leaves. These rules 
alone produce a complete black tree. The root can be coloured by ReddenRoot, 
safely reducing the black height, and then pushed down and picked by the other 
rules. Eventually we reach a singleton which is rewritten to $Acc_L$ by FellStump. 



6 Related Work 

In functional programming, nested types can be used to specify perfect binary 
trees [13]; however, these are only complete balanced binary DAGs as they do 
not preclude sub-tree sharing. 

The following papers specify shapes using variants of context-free graph 
grammars, or certain logics. They can all specify trees, but none tackle the 
problem of specifying non-context-free properties like balance. ADDS [12] speci- 
fies structures by a number of dimensions where arcs are restricted to point away 
from, or towards, the root in a specified dimension. It can also limit node inde- 
gree. The logic of reachability expressions [5] allows the reachability, cyclicality 
and sharing properties of pointer variables to be specified as logical formulae. It 
is decidable whether a structure satisfies such a specification (but the complexity 
is unclear) and the logic is closed under intersection, union and complement. In 
role analysis [15] the shapes of pointer data structures are restricted by specify- 
ing whether pointers are on cyclic paths and by stating which pointer sequences 
form identities. The number and kind of incoming pointers are also specified. An 
algorithm verifies programs annotated with role specifications. Graph types [14] 
are recursive data types extended with routing expressions which allow the tar- 
get of a pointer to be specified relative to its source. In [16], graph types are 
defined by monadic 2nd-order logic formulae and a pointer assertion logic is 
used to annotate C-like programs with partial correctness specifications; a tool 
checks that programs preserve their graph type invariants. 

Shape types [8, 9] are specified by context-free graph grammars. They can al- 
ways be converted to equivalent GRSs [2], but the classes of context-free graph 
languages and PGRS languages are incomparable. However, PGRSs can specify 
context-sensitive shapes and we are not aware of any common data structure 
with a context-free specification and no PGRS. Shape types have a method for 
checking the shape-invariance of atomic transformations (individual pointer manipulations). 
GRSs have a similar method which can verify the shape safety of an 
algorithm; more detailed explanations are available in [3, 1]. Context-exploiting 
shapes [7] are generated by hyperedge-replacement rules extended with context; 
the precise relation to PGRSs is unclear. Membership checking is exponential but 
there is a restricted, decidable class of shaped transformation rules that preserve 
context-exploiting shapes. 

For related work on graph parsing see [4] which discusses context-free graph 
grammars and layered graph grammars which are powerful but have an exponential 
membership algorithm. 

7 Conclusion and Future Work 

Graph reduction specifications are a powerful formal framework, capable of defin- 
ing data structures with non-context-free properties. The examples presented 
here show how they can specify complete binary trees (Section 4) and red-black 
trees (Section 5). Many other examples, including AVL trees and grids, are given 
in [2]. The GRS tool available from [1] implements GRS checking including con- 
fluence, membership and operation checks. 

We intend to develop programming languages which offer safe pointer ma- 
nipulation based on GRSs. We are investigating two approaches. 

1. A new pointer programming paradigm. Algorithms will be described as oper- 
ations on graphs with data fields; the shapes of intermediate structures will be 
specified or inferred and checked. Checking is undecidable in general; we plan 
to investigate its feasibility on practical examples; the method described in [3] 
is a starting point. For operations like insertion into red-black trees [6] a better 
checker will be required, and possibly more informative specifications, because 
the current checker is often non-terminating on non-context-free shapes. 

2. An imperative programming language. Combining conventional pointer manip- 
ulation with types specified by GRSs: pointer algorithms will be abstracted and 
then checked as in the first approach. Here the main challenge is to fit the op- 
erational semantics of a garbage-collected imperative language to the semantics 
of double-pushout graph rewriting. 
Abstract. Software engineering applications, like integrated development 
environments or CASE tools, often work on complex documents 
with graph-like structures. Modifications of these documents can be re- 
alized by graph transformations. Many graph transformation systems 
operate only in volatile memory and thus suffer from a couple of draw- 
backs. 

In this paper, we present the graph model of the Gras/GXL database 
management system. Gras/GXL enables graph based applications to 
store their graphs persistently in commercial databases. Because these 
applications usually have their own graph model, a mapping from this 
graph model to the Gras/GXL graph model has to be realized. We will 
present mappings for the PROGRES, DiaGen, and DiaPlan graph 
models in this paper. For the PROGRES graph model we will also show 
its realization. 



1 Introduction 

Integrated development environments, visual language editors, and reengineer- 
ing applications often use graphs or graph-like structures, as data structures 
for their documents. These graphs contain entities of different types on differ- 
ent levels of abstraction. Besides accessing these entities efficiently, applications 
perform complex queries and operations on the graphs — e.g. for ensuring con- 
sistency constraints, enumerating affected and related entities, or removing a net 
of related entities. 

Nowadays, most of the applications mentioned before are still implemented 
manually. Specifying the application logic with graph transformations is a more 
elaborate approach. The logic is no longer implemented by hand but by using 
a textual or graphical specification. Now, the developer can concentrate on the 
solution of the problem and ignore implementation specific details. Moreover, 
specifying complex operations and queries visually is more convenient than im- 
plementing the operations manually, modifying them later, and keeping track of 
all dependencies. 

Some graph transformation systems can even generate code for languages 
like C++ or Java from the specification. Applications are then built on top 
of the generated code. Many examples show that this approach performs well: 
Based on the graph transformation system PROGRES [21] and the prototyping 
framework UPGRADE [3], different groups at our department developed the 
AHEAD [11] and E-CARES [15, 16] prototypes, among others. Similar results 
were achieved by other groups: DiaGen [17] is used for realizing editors for 
visual languages like state chart diagrams [18]. Fujaba [12] is used for simulating 
production control systems. 

Many graph transformation systems store their graphs in volatile memory. 
At each system start the graph is loaded and must be stored before the appli- 
cation exits. The drawbacks of this approach are obvious: Data are lost when 
the application crashes, the graph size is limited, database functionality must be 
reimplemented, etc. Let us discuss these problems in detail using two examples 
from our department. 

The E-CARES reengineering environment analyses the source code, runtime
traces, and other sources of information of a telecommunication system to
recover its architecture. The initial steps of the analysis create huge graphs of
about 300,000 nodes per subsystem¹. For inter-subsystem communication,
multiple subsystems have to be analysed. Storing these graphs in main memory has
several disadvantages: they are too large, recovering lost information is very
expensive², and system startup and shutdown slow down significantly.

Another example is AHEAD, an environment for the administration of de- 
velopment processes. AHEAD features an agenda containing the tasks assigned 
to each developer. The developers use a distributed application to perform the 
tasks to which they have been assigned. Ensuring the data consistency of the 
distributed application is a very complicated task and many problems solved in 
database management systems before must be solved again. Both applications 
demand a graph database, such as GRAS. 

When the development of the GRAS database management system started 
in 1984, common database management systems lacked many features required 
for the implementation of software development environments — for example, 
storing graphs efficiently, offering undo / redo of graph modifications, graph 
change events, etc. To obtain these features, problems solved in every database 
management system had to be solved again — transaction management, efficient 
data organization, concurrent access, data consistency, etc. PROGRES and all
prototypes built with it, as well as the IPSEN environment [20], utilize the GRAS
database.

However, databases have changed since the early days of GRAS, and the costs
for maintaining and extending GRAS have increased. Moreover, our experience
in building prototypes with PROGRES and UPGRADE showed that the graph
model of PROGRES is not sufficient for specifying some problems appropriately.
As the PROGRES graph model is tightly coupled to the GRAS graph model,
enhancements would have to be made to the GRAS graph model, which is not
feasible. Thus, we decided to implement a new GRAS database management
system, Gras/GXL. Whereas the proven, useful functionality of GRAS
must remain, more flexibility has to be added to the system.

¹ Because of GRAS’ current 65,000 node limit, only the most important information
is stored in the graph at the moment. A recent modification of GRAS, which
supports roughly 1,000,000 nodes, reduces the problem only slightly because the
overall storage space could not be increased.

² A recovery may require the re-analysis of a significant number of documents, which
may have to be created again. Moreover, architecture modifications reflecting
reengineering decisions may be lost.

Instead of a fixed graph model the new Gras/GXL database provides a graph 
model on top of which concrete graph models are realized. The advances in 
database management systems and standardization of service interfaces make it 
possible to reduce the development effort dramatically by building on third-party 
components, such as commercial database management systems or transaction 
managers. Besides a reduction in development efforts we expect increased reli- 
ability and stability, due to enhancements in the components we use. Because 
of standardized service interfaces, one component can be replaced by another 
more easily. The ability to replace components contributes to the scalability of 
Gras/GXL: For small prototypes, a small main-memory database can be
used for storing graphs. For applications which have to handle a huge amount
of data, like E-CARES, commercial database management systems can be
used.

The Gras/GXL database management system allows graph transformation
systems to store their graphs persistently in different commercial database
systems. Because virtually every graph transformation system has its own graph
model, the predecessors of Gras/GXL could not be utilized by them. The
Gras/GXL graph model, however, can be adapted to match exactly the graph model
of the graph transformation system. The graph transformation system can now use a
commercial database without undergoing major modifications.

In this paper, we present the graph model used by the successor of GRAS, 
the Gras/GXL database management system. Section 2 presents the specific 
graph models of DiaGen, DiaPlan, and PROGRES for which we will define 
a mapping in Section 4. Our discussion of the PROGRES graph model will show 
which situations cannot be specified appropriately and outline their influence on 
the specification and the prototype. In Section 3, we present the Gras/GXL graph
model and its origin, and relate it to the GXL graph model [10]. A mapping of the graph
models discussed in Section 2 to the graph model of Gras/GXL is presented in 
Section 4. For the PROGRES graph model we will present the realization of the 
mapping in Section 5. Section 6 summarizes the presented results and sketches 
some open problems. 



2 Graph Models of Different Graph Transformation 
Approaches 

Before introducing the graph model of Gras/GXL we give a short overview of the
graph models of DiaGen, DiaPlan, and PROGRES. Based on this overview,
a mapping from these graph models to the graph model of Gras/GXL will be
presented in Section 4.

2.1 Graph Models of DiaGen and DiaPlan 

DiaGen generates diagram editors for a specific visual language from a textual 
specification. The resulting editors support free hand editing of diagrams in con- 
formance with the corresponding visual language. If the resulting visual language 
is executable, like state charts, the diagrams can be animated. The formal
background of DiaGen is formed by hypergraphs and hypergraph transformations.

The hypergraphs used by DiaGen are a generalization of directed graphs
as explained in [17]. A hypergraph consists of a set of labeled nodes, a set of
directed, labeled edges, and a set of labeled hyperedges. The labels in hypergraphs
correspond to types of ordinary programming languages. An edge connects, or
visits, two nodes which may be identical. Hyperedges have a fixed number of
labeled tentacles, which are determined by the label of the hyperedge. The
tentacles connect nodes to the hyperedge. The order of the tentacles, and thus the
order in which the nodes are visited, is determined by the hyperedge’s label.
Obviously, edges can be represented by binary hyperedges. Thus, it is a matter
of style whether edges are used or not.
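The hypergraph notions above can be illustrated by a minimal Java sketch. All class and member names (HgNode, HgHyperedge, binaryEdge, and the tentacle labels "source" and "target") are hypothetical and chosen for illustration only; they are not taken from the DiaGen implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical labeled node; not DiaGen's actual class. */
final class HgNode {
    final String label;
    HgNode(String label) { this.label = label; }
}

/** Hypothetical labeled hyperedge with an ordered list of labeled tentacles. */
final class HgHyperedge {
    final String label;
    final List<String> tentacleLabels; // order is fixed by the hyperedge label
    final List<HgNode> visited;        // visited.get(i) is attached to tentacle i

    HgHyperedge(String label, List<String> tentacleLabels, List<HgNode> visited) {
        if (tentacleLabels.size() != visited.size()) {
            throw new IllegalArgumentException("one visited node per tentacle required");
        }
        this.label = label;
        this.tentacleLabels = new ArrayList<>(tentacleLabels);
        this.visited = new ArrayList<>(visited);
    }

    int arity() { return visited.size(); }

    /** A directed edge expressed as a binary hyperedge (assumed tentacle labels). */
    static HgHyperedge binaryEdge(String label, HgNode source, HgNode target) {
        return new HgHyperedge(label,
                Arrays.asList("source", "target"),
                Arrays.asList(source, target));
    }
}
```

In this sketch a self-loop is simply a binary hyperedge whose two tentacles visit the same node, which matches the remark that an edge may connect two identical nodes.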

DiaPlan [7] is a visual rule-based diagram programming language that
allows more complex animations than DiaGen. The computational model of
DiaPlan is based on shapely nested hierarchical graph transformations. For DiaPlan
the graph model of DiaGen has been extended by hierarchies. Edges as well as
hyperedges may contain at most one hypergraph and are then called frames. Of
course, a hypergraph contained in a frame can contain other frames. Thus, the
resulting hierarchy has a tree-like structure and there is always exactly one top
graph. Hoffmann extended the role of nodes in [8] and [9]: nodes may contain
hypergraphs as well, and special external nodes, called points, were introduced.
Points are nodes at which a graph may be connected to other graphs and which
cannot contain a hypergraph. Note that edges crossing frame boundaries are
forbidden. The integration of DiaPlan into DiaGen for generating diagram
editors is planned.

2.2 Graph Model of PROGRES 

PROGRES is used at our department to build the application logic of pro- 
totypical applications. The application domains of the prototypes range from 
conceptual design in civil engineering [13] to project management (AHEAD). 
The PROGRES graph model is based on directed attributed graphs. A graph 
consists of a set of labeled nodes and a set of directed labeled edges. Only the 
nodes are first class objects and may carry attributes. 

Edges do not have an identity and serve as binary relationships only. Attributed
edges, edges on edges, n-ary relations, or hierarchical graphs are not
supported. However, our experience in building prototypes with PROGRES showed
that these constructs are needed in virtually every specification. Fortunately,
the PROGRES graph model is capable of emulating all constructs mentioned before.
For example, an attributed edge can be emulated by an edge-node-edge
construct. One edge points to the source of the attributed edge, the other edge
points to the target. The node carries the attributes. Hierarchical graphs can be
emulated by specially typed edges.
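The edge-node-edge emulation can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the classes PNode, PEdge, and PGraph and the edge types "toSource" and "toTarget" are hypothetical and do not reflect the PROGRES or GRAS interfaces.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical node: the only first class object, may carry attributes. */
final class PNode {
    final String type;
    final Map<String, Object> attributes = new HashMap<>();
    PNode(String type) { this.type = type; }
}

/** Hypothetical edge: a plain typed binary relationship without attributes. */
final class PEdge {
    final PNode source;
    final String type;
    final PNode target;
    PEdge(PNode source, String type, PNode target) {
        this.source = source;
        this.type = type;
        this.target = target;
    }
}

final class PGraph {
    final List<PNode> nodes = new ArrayList<>();
    final List<PEdge> edges = new ArrayList<>();

    PNode createNode(String type) {
        PNode n = new PNode(type);
        nodes.add(n);
        return n;
    }

    PEdge createEdge(PNode source, String type, PNode target) {
        PEdge e = new PEdge(source, type, target);
        edges.add(e);
        return e;
    }

    /** Emulates an attributed edge by a carrier node and two plain edges. */
    PNode createAttributedEdge(PNode source, PNode target, String type,
                               Map<String, Object> attrs) {
        PNode carrier = createNode(type);        // the node carries the attributes
        carrier.attributes.putAll(attrs);
        createEdge(carrier, "toSource", source); // one edge points to the source
        createEdge(carrier, "toTarget", target); // the other points to the target
        return carrier;
    }
}
```

Every emulated edge thus costs one additional node and two edges, which is the overhead that later has to be folded back into a single attributed edge at the UI level.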

The influence of emulating attributed edges and other constructs on the spec- 
ification style and prototype creation cannot be neglected. Because edge-node- 
edge constructs are harder to use in visual graph tests and graph transforma- 
tions — even if path expressions are used — textual transformations dominate 
the specification. When a prototype is built based on the code generated from 
the specification, the edge-node-edge constructs have to be transformed at the 
UI level to an attributed edge. Otherwise, a user has difficulties understanding
the diagrams presented by the prototype³. Mapping every single instance of an
edge-node-edge construct to an attributed edge is very expensive and results 
in slower updates of the user interface. The Gras/GXL data model solves these 
problems when supported by PROGRES and the UPGRADE framework as we 
will see in Section 4. 



3 Graph Model of Gras/GXL 

The previous discussion on PROGRES, GRAS, and the prototypes built at our 
department outlined the disadvantages of the present GRAS version — an in- 
flexible graph model, limited graph size, and maintenance problems. 

In this section, we present the graph model of Gras/GXL. One major differ- 
ence between GRAS and Gras/GXL is that GRAS defines a single graph model 
for all applications. In contrast, the Gras/GXL graph model is never used di- 
rectly by an application. On top of the Gras/GXL graph model, a specific graph 
model is defined — for example a DiaGen graph model — which is then used 
by an application. 

Before introducing the Gras/GXL graph model, we give a short overview of
the GXL graph model, as it served as the starting point of our graph model
development.



3.1 GXL Graph Model 

GXL has been widely accepted as a standard format for exchanging graphs. It 
is used by many reengineering applications, like Rigi [19] and Gupro [14], and 
graph visualization tools, like JGraph [1] and UPGRADE [3]. 

The graph model of GXL is shown in Figure 1. A GXL document stores 
several graphs. Graphs may be typed and can have an arbitrary number of 
attributes. Attribute types are defined by a sublanguage which is not covered in 
this article. 

³ Imagine a UML class diagram where the roles and cardinalities are nodes connected
to the association by edges instead of labels.

Fig. 1. GXL graph model (from [6])



Every graph belongs to exactly one GXL document and can be identified by 
a user defined identifier, called a role. A graph contains an arbitrary number of 
graph elements (nodes, edges, and relations), where each graph element belongs 
to exactly one graph. Relations represent n-ary relationships between graph el- 
ements. They can also be used to express nodes (unary relations) and edges 
(binary relations). However, this representation of nodes and edges is not convenient
for most users. Therefore, the creators of GXL decided to provide nodes and
edges as primitive (atomic) constructs. Edges as well as relations can be
ordered. Relation ends cannot be typed but they may carry attributes. Just like
graphs, relation ends can be distinguished from each other by user-defined roles.
Hierarchical graphs are created by placing an arbitrary number of graphs inside 
a graph element. Like graphs, graph elements can have a type and attributes. 

Graph elements (i.e., nodes, edges, and relations), but not graphs, can be
connected by edges and, through relation ends, by relations. Applications that
require edges between graphs have to emulate these inter-graph edges, for
example, by using nodes which contain graphs. Another limitation, which is
not expressed in the graph model, is that only elements within the same GXL
document can be interconnected⁴.

⁴ The common super class of edges and relations, LocalConnection, can be regarded
as a realization of this constraint.

Fig. 2. Graph model of Gras/GXL



Referencing graph elements or graphs within a graph is not yet allowed. A
change request that addresses this open issue is still pending. At the moment,
references to graph elements can only be realized by using edges that have special
user-defined types. References to graphs cannot be realized in this way because
the GXL graph model does not permit edges to graphs.

Attributes are another option for realizing references to graphs or graph
elements. However, as the attribute sublanguage of GXL does not support attributes
with graphs or graph elements as values, such references have to be encoded in a
user-defined way.

Both issues are addressed by the upcoming GXL release 1.1, for which a
preliminary document type definition (DTD), but no formal graph model, is
available. This release provides graph references as well as graph-valued attributes.
References to graph elements will not be supported.

Another limitation that has not been addressed so far is direct nesting of 
graphs. The GXL graph model demands that graphs can only be contained in a 
graph element or the surrounding GXL document. 

3.2 Gras/GXL Graph Model 

The discussion of the GXL graph model made clear that this model provides a 
good starting point for the graph model of the Gras/GXL database management 
system. However, the discussion also showed that some features, which may
be used by certain graph models, are not yet supported. In the following, we
will discuss the Gras/GXL graph model (see Figure 2) in detail and present 
its differences compared to the GXL graph model. Despite the extensions and 
modifications the Gras/GXL graph model is able to express the same graph 
classes as the GXL graph model. 
As in a GXL document, an arbitrary number of graphs are stored in
a Gras/GXL graph pool. Each graph can be identified by its role. GXL
regards graphs and graph elements as different entities. As a drawback of this
approach, graphs cannot be connected to each other by edges (or relations). In
Gras/GXL we regard graphs as just a special kind of graph element, like nodes
or relations. This is a substantial difference as we shall see.

A Gras/GXL graph contains an arbitrary number of graph elements: nodes,
edges, relations, and graphs. In GXL, edges and relations can only connect graph
elements within the same document. The Gras/GXL graph model permits edges
and relations connecting graph elements stored in different graphs, even in
physically different databases⁵. GXL does not support such connections, although
they are needed to relate elements of different graphs. As explained before, graphs
are just ordinary graph elements in our graph model. Thus, they can be visited by edges
and relations directly without using special graph elements. As in the GXL graph
model, edges and relations can be ordered.
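The consequence of treating graphs as ordinary graph elements can be illustrated by a small Java sketch. The class names (GxElement, GxNode, GxEdge, GxGraph) are hypothetical and do not correspond to the classes of the Gras/GXL implementation; relations, attributes, and persistence are omitted.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical common super class: every graph element has a type. */
abstract class GxElement {
    final String type;
    GxElement(String type) { this.type = type; }
}

final class GxNode extends GxElement {
    GxNode(String type) { super(type); }
}

/** An edge may visit any graph element, including a graph. */
final class GxEdge extends GxElement {
    final GxElement source;
    final GxElement target;
    GxEdge(String type, GxElement source, GxElement target) {
        super(type);
        this.source = source;
        this.target = target;
    }
}

/** A graph is itself a graph element and can therefore be contained and visited. */
final class GxGraph extends GxElement {
    final String role;
    final List<GxElement> elements = new ArrayList<>();

    GxGraph(String type, String role) {
        super(type);
        this.role = role;
    }

    <T extends GxElement> T add(T element) {
        elements.add(element);
        return element;
    }
}
```

Because GxGraph extends GxElement, an edge between two graphs, or between nodes of two different graphs, needs no auxiliary node; this is the case that has to be emulated in GXL.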

Neither the current nor the upcoming GXL version supports references to
graph elements. References to graphs are only supported by the upcoming GXL
1.1 version. Our graph model allows a graph to reference other graph elements.
We decided not to restrict references to graphs, as GXL does, for two reasons:
(1) references to graph elements allow for much greater flexibility and (2) as
graphs are a special kind of graph element, this comes naturally.

Hierarchical graphs are created either by a containment relationship or by
graph-valued attributes. The containment relationship is used if a graph should
be contained in another graph. Graph-valued attributes are used in all other
situations, for example if a relation should contain a graph. The use of
graph-valued attributes together with the containment relationship allows us to create
arbitrary hierarchical graphs. Although this approach seems complicated,
it allows us to handle even complex situations uniformly, like hierarchies of
graphs stored in different databases, which is more important than convenient
usage of the graph model. The result is a clean and efficient realization of graph
hierarchies.

Graph elements and relation ends may have an arbitrary number of at- 
tributes, just as in GXL. We do not offer a separate language for attribute 
definitions. Instead, the graph schema of the application will be defined using 
Java classes. Hence, attributes can be of any serializable Java type. 

A graph element (including graphs) must have a type, whereas in GXL graphs 
and graph elements may be typed. Although untyped graphs are common in 
some application domains, we do not support them in the graph model for the 
following reasons: First, complex schema-based operations can be realized with 
the help of the underlying database more efficiently. Second, a concrete graph 
model can create an implicit schema for schema-less graph models. 

Summarizing, the graph model of the Gras/GXL database system solves
the problems identified for GRAS before. As our graph model is not targeted
towards a particular graph model, only some consistency checks can be performed.
Most consistency checks (for example, the number of subgraphs allowed within
a graph element) must be performed by the implementation of a concrete graph
model. The next section will explain how concrete graph models are realized
based on Gras/GXL’s graph model.

⁵ Not yet supported by the existing implementations of the graph model.

Fig. 3. Class diagram for the PROGRES graph model

4 Mapping Different Graph Models to Gras/GXL 

One aim of the Gras/GXL database management system is the support of dif- 
ferent graph models. The implementation of these graph models and the appli- 
cations that use them can rely on the infrastructure provided by Gras/GXL — 
for example, incremental attribute evaluation or graph queries. As mentioned 
before, a drawback of this approach is that the graph model can only perform 
limited consistency checks — existence checks for types or graph elements, etc. 
More complex checks (like cardinality checks) or prohibiting the use of certain 
graph elements — for example, references to graph elements — have to be shifted 
to the realization of a concrete graph model. 

4.1 Mapping for the PROGRES Graph Model 

The PROGRES graph model is simple enough to present, within a paper, the steps
which are necessary to map a certain graph model to the Gras/GXL graph model.
In some cases, only a mathematical or textual description of the graph
model is available. Then, the first step is the creation of a class diagram based
on this description. Figure 3 presents the PROGRES graph model we deduced
from the description in Section 2.2.

Fig. 4. Class diagram for DiaGen’s graph model

The next step should be to add methods to the class diagram which are used 
to create and modify graphs. But, we will skip this step for the sake of brevity. 

Now, we define the mapping of the PROGRES graph model to the 
Gras/GXL graph model. In this case a simple one-to-one mapping is sufficient: 
Graphs are mapped to graphs, nodes to nodes, and edges to edges. The
implementation of all classes (for example, Graph) is backed by the corresponding
classes of the Gras/GXL graph model.

Of course, creating the mapping is not enough. The constraints defined for 
the PROGRES graph model must be ensured by the realization of the mapping. 
For example, before an edge connects two nodes the realization must check if 
the types of the nodes are compatible with the types allowed for the source 
and target node of the edge. In Section 5 we will discuss the realization of the 
PROGRES mapping in more detail. 

4.2 Mapping for the DiaGen Graph Model 

Next we define a mapping from DiaGen’s graph model to Gras/GXL’s graph
model. We use the textual description from Section 2.1 to deduce the class
diagram shown in Figure 4. A Hypergraph contains an arbitrary number of labeled
GraphElements: Nodes, Edges, and Hyperedges. We do not consider tentacles
as graph elements that should be contained in the hypergraph, because
they are part of a hyperedge. Thus, the class Tentacle is not a sub-class of
GraphElement, but of LabeledElement, because tentacles have a label. The
order of tentacles is of importance, which is expressed by the constraint {ordered}
for the association tentacles between Hyperedge and Tentacle. The association
visits expresses that a tentacle visits exactly one node and that a node
can be visited by several tentacles. The same applies to the associations source
and target between the classes Node and Edge. In addition, we introduce the
class GraphPool which contains an arbitrary number of hypergraphs. Thus, our
database can store more than just one hypergraph.

Fig. 5. Class diagram for DiaPlan’s graph model

After deducing a class diagram which corresponds to the description of
DiaGen’s graph model, we can define the methods necessary for modifying
hypergraphs. We consider only one operation as an example in this paper.
The class Hypergraph provides the method createEdge(source node, target
node, edge label), which creates an edge with the specified label from the source
node to the target node.

After all methods have been defined we can proceed to the next step, the
definition of a mapping from DiaGen’s graph model to our graph model. We begin
with the easy steps: Hypergraphs are mapped to graphs, nodes to nodes, and
edges to edges. Hyperedges are mapped to relations and tentacles to relation
ends. The order of the tentacles is ensured by the order attribute of the relation
ends. The label of a tentacle is mapped to the role of the corresponding relation
end. The method createEdge should connect edges with nodes, but not with
other edges. This can be ensured by using appropriate types for the source
node and target node parameters.
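The mapping of hyperedges to relations can be sketched as follows. This is a minimal illustration under stated assumptions, not the Gras/GXL API: the classes RelationEnd, Relation, and HyperedgeMapper, as well as the use of string identifiers for nodes, are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical relation end: one per tentacle. */
final class RelationEnd {
    final String role;   // taken from the tentacle label
    final int order;     // position of the tentacle within the hyperedge
    final String nodeId; // identifier of the visited node

    RelationEnd(String role, int order, String nodeId) {
        this.role = role;
        this.order = order;
        this.nodeId = nodeId;
    }
}

/** Hypothetical n-ary relation: one per hyperedge. */
final class Relation {
    final String type;   // taken from the hyperedge label
    final List<RelationEnd> ends = new ArrayList<>();
    Relation(String type) { this.type = type; }
}

final class HyperedgeMapper {
    /** Maps a hyperedge, given by its label and ordered tentacles, to a relation. */
    static Relation toRelation(String hyperedgeLabel,
                               List<String> tentacleLabels,
                               List<String> visitedNodeIds) {
        if (tentacleLabels.size() != visitedNodeIds.size()) {
            throw new IllegalArgumentException("each tentacle visits exactly one node");
        }
        Relation relation = new Relation(hyperedgeLabel);
        for (int i = 0; i < tentacleLabels.size(); i++) {
            // The order attribute preserves the tentacle order of the hyperedge.
            relation.ends.add(new RelationEnd(tentacleLabels.get(i), i, visitedNodeIds.get(i)));
        }
        return relation;
    }
}
```

Restricting the parameters to node identifiers mirrors the remark on createEdge: a signature that only accepts nodes cannot be used to attach a tentacle to another edge.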

4.3 Mapping for the DiaPlan Graph Model 

In Section 2.1 we introduced the graph model of DiaPlan as an extension of
the DiaGen graph model. Based on the discussion of this extension we deduced
the graph model shown in Figure 5. DiaPlan’s graph model has two major
differences compared to the DiaGen graph model: (1) nodes and hyperedges may
contain a hypergraph, and (2) a new kind of node has been introduced, called
point or external node. These differences lead to the following modifications of
Figure 4: We introduce a new class ExtNode which represents external nodes (or
points) which are contained in a hypergraph. External nodes can be visited by
hyperedges through their tentacles, just like nodes. Note that we dropped
the Edge class, because it can be substituted by binary hyperedges. The
association subgraph between the classes GraphElement and Hypergraph denotes
that any graph element, i.e., a node or a hyperedge, may contain at most one
hypergraph. Because external nodes are not allowed to contain a hypergraph,
they are not a subclass of GraphElement. Remember that a hypergraph can
only be a top-level graph or contained in a graph element. However, a hypergraph
cannot be a top-level graph and a subgraph of a graph element at the same
time, which is denoted by the OCL [23] constraint xor between the associations
contains and subgraph.

With the class diagram shown in Figure 5, a mapping to the Gras/GXL graph
model can be defined by extending the mapping defined for DiaGen’s graph
model before: external nodes are mapped to nodes. The hierarchies are realized
by using graph-valued attributes. Of course, the implementations of all methods
must ensure the consistency constraints of the graph model, for example, no
boundary crossing edges, existence of a top-level hypergraph, etc.
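One of these constraints, the absence of boundary crossing edges, could be checked as in the following Java sketch. The classes DpGraph, DpNode, and DpHyperedge are hypothetical and only model the ownership information needed for the check; they are not part of DiaPlan or Gras/GXL.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical hypergraph; only its identity matters for the check. */
final class DpGraph {
    final String name;
    DpGraph(String name) { this.name = name; }
}

/** Hypothetical node that knows its owning hypergraph. */
final class DpNode {
    final DpGraph owner; // hypergraph that contains this node
    DpGraph subgraph;    // at most one nested hypergraph, or null
    DpNode(DpGraph owner) { this.owner = owner; }
}

/** Hypothetical hyperedge that refuses to visit nodes of other hypergraphs. */
final class DpHyperedge {
    final DpGraph owner;
    final List<DpNode> visited = new ArrayList<>();
    DpHyperedge(DpGraph owner) { this.owner = owner; }

    /** Attaches a tentacle; rejects nodes outside the hyperedge's own hypergraph. */
    void visit(DpNode node) {
        if (node.owner != owner) {
            throw new IllegalArgumentException("edge would cross a frame boundary");
        }
        visited.add(node);
    }
}
```

In this sketch the check compares owners by identity, so a node of a nested hypergraph is rejected even if the nesting node itself belongs to the hyperedge's hypergraph.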

The mappings for the PROGRES, DiaGen, and DiaPlan graph models show
that it is possible to map these graph models onto the graph model of Gras/GXL.
Our mapping of the DiaPlan graph model showed that graph hierarchies can
be realized with the help of graph-valued attributes.

5 Realizing the PROGRES Mapping 

The previous section showed how mappings of specific graph models to the graph 
model of Gras/GXL are realized on the conceptual level. In this section we will 
focus on the concrete realization of the mapping for the PROGRES graph model 
based on a few examples. 

We have seen that the Gras/GXL graph model is richer than the graph 
models which are mapped on it, i.e. features not required by the PROGRES 
graph model — like n-ary relations — are provided by the Gras/GXL graph 
model. From a conceptual point of view a much simpler graph model would 
be sufficient, for example one that just provides attributed nodes and binary 
edges. Any other graph model could be mapped on such a simple graph model. 
However, such a mapping could be quite complicated and its realization would 
not be very efficient. 

Instead, the Gras/GXL graph model offers a superset of the features found
in existing graph models. Thus, the mapping of a specific graph model to the
Gras/GXL graph model is quite simple, as we have seen before, and the
implementation of the mapping can be realized efficiently, as we will see in the
following examples.

    public ProgresEdgeID createEdge(ProgresNodeID source,
                                    ProgresNodeID target,
                                    ProgresEdgeTypeID type)
            throws EntityNotFoundException, GrasGXLException {
        // Both end nodes must exist in this graph.
        if (!contains(source) || !contains(target)) {
            throw new EntityNotFoundException("...");
        }
        // The node types must conform to the edge type declaration.
        ProgresEdgeType edgeType = schema.getEdgeTypeByID(type);
        if (!source.instanceOf(edgeType.getSourceClassID()) ||
            !target.instanceOf(edgeType.getTargetClassID())) {
            throw new GrasGXLException("...");
        }
        // Create the backing edge and wrap its identifier.
        EdgeID edgeID = graph.createEdge(source.getBackingNodeID(),
                                         target.getBackingNodeID(),
                                         type.getBackingEdgeTypeID()).getEdgeID();
        return new ProgresEdgeID(edgeID, source, target, type);
    }

Fig. 6. Source code for creating an edge in the PROGRES graph model

Our first example deals with the creation of an edge to connect two nodes. 
The PROGRES graph model requires that the types of the source and target
nodes match the types of the source and target node in the definition of the edge
type. Of course, the source and target node must exist in the graph and the edge 
type must have been declared in advance. Our implementation of the mapping 
must check the following things before the edge can be created: existence of 
source and target node, existence of the edge type, and the type conformance of 
the source and target node. After ensuring that these conditions hold, the edge 
can be created between the two nodes. Since the application should not be aware 
of the existence of the Gras/GXL graph model we create an identifier which is 
specific for the PROGRES graph model and return it to the application. The 
source code in Figure 6 shows exactly these steps. 

The realization of the method is straightforward and not very surprising.
Apart from this simplicity, however, the example does not make the benefit of
having a rich graph model obvious.

The following example illustrates the benefits of having a rich graph model
like the Gras/GXL graph model. In this example we implement a method which
returns all nodes in the graph which are an instance of a node class, including all
sub-classes. Because the Gras/GXL graph model is aware of graph schemas with
node classes and inheritance, we can realize this method easily and efficiently.
Our realization can directly use the method provided by the Gras/GXL graph
model for this task, which is able to delegate most parts of the query to
the underlying database (e.g., a couple of joins in the case of a relational
database). The only task which is left to our realization, shown in Figure 7,
is the conversion of Gras/GXL node identifiers to PROGRES node identifiers.
Without Gras/GXL’s rich graph model the realization would be much more
complicated and less efficient, because we would have to implement our own
schema management.

    public Collection getAllNodesOfClass(ProgresNodeClassID nc)
            throws GrasGXLException {
        Collection retSet = new HashSet();
        // The schema-aware query is answered by the Gras/GXL graph model.
        Collection set = graph.getAllNodesOfClass(nc.getNodeClassID());
        for (Iterator i = set.iterator(); i.hasNext(); ) {
            NodeID nid = (NodeID) i.next();
            // Convert the Gras/GXL identifier to a PROGRES identifier.
            String typeName = nid.getNodeClassID().getName();
            ProgresNodeTypeID ntid = schema.getNodeTypeIDByName(typeName);
            retSet.add(new ProgresNodeID(nid, ntid));
        }
        return retSet;
    }

Fig. 7. Source code for retrieving all nodes matching a specific node class
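To indicate what such a hand-written schema management would involve, the following self-contained Java sketch answers the same query over an in-memory node table. It is an illustration only: ToySchema, ToyGraph, and their methods are hypothetical, and the sketch scans all nodes instead of delegating the query to a database as Gras/GXL does.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical schema: single inheritance between node classes. */
final class ToySchema {
    private final Map<String, String> superClassOf = new HashMap<>();

    /** Declares a node class; superClass may be null for a root class. */
    void declare(String nodeClass, String superClass) {
        superClassOf.put(nodeClass, superClass);
    }

    /** True if nodeClass equals ancestor or inherits from it. */
    boolean isSubclassOf(String nodeClass, String ancestor) {
        for (String c = nodeClass; c != null; c = superClassOf.get(c)) {
            if (c.equals(ancestor)) {
                return true;
            }
        }
        return false;
    }
}

/** Hypothetical graph that stores only the class of each node. */
final class ToyGraph {
    private final ToySchema schema;
    private final Map<String, String> classOfNode = new LinkedHashMap<>();

    ToyGraph(ToySchema schema) { this.schema = schema; }

    void createNode(String nodeId, String nodeClass) {
        classOfNode.put(nodeId, nodeClass);
    }

    /** Returns all nodes that are instances of nodeClass or of one of its sub-classes. */
    List<String> getAllNodesOfClass(String nodeClass) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> entry : classOfNode.entrySet()) {
            if (schema.isSubclassOf(entry.getValue(), nodeClass)) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

Even this toy version has to maintain the inheritance relation and walk it for every node, which is the work that the mapping in Figure 7 leaves to the Gras/GXL graph model.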

The experience gained during the realization of the mapping for the PROGRES
graph model led us to the conclusion that it should be possible to generate the
mapping based on a UML class diagram, constraints, and possibly other
information. At the moment we are investigating this issue. General criteria for
a good mapping have not been defined up to now and are also part of our future
work.

6 Conclusion and Future Work 

In this paper we presented the Gras/GXL database management system and its 
graph model. We explained how different graph models can be implemented on 
top of the graph model — namely the graph models of PROGRES, DiaGen, and 
DiaPlan. The realization of these graph models can be used to add persistency 
capabilities to these graph transformation systems. 

The Gras/GXL database management system is still in its early stages. Cur- 
rently, the implementation of the graph model for the PostgreSQL DBMS and 
the in-memory implementation have been finished. An implementation for the 
FastObjects OODBMS is under construction. A transaction management ser- 
vice based on the OpenORB transaction manager as well as the rule engine for 
triggering and handling events have been implemented. The two finished imple- 
mentations of the graph model utilize these services and are fully functional, 
tested with the help of unit test cases [2]. To compare the different realizations
we have just begun implementing a benchmarking application.

The implementation of the PROGRES graph model is about half finished; the
implementation of the DiaGen and DiaPlan graph models has just started. After
finishing the implementation of the PROGRES graph model, we will extend the
PROGRES environment to generate code from PROGRES specifications for our
database management system instead of GRAS. In combination with the
implementation of a new base layer for the UPGRADE framework, it will then be
possible to use Gras/GXL for PROGRES prototypes. Afterwards, we will focus on
the development of a graph query language and on how to utilize it for graph
matching and for defining views on graphs. In addition, we will examine how
Gras/GXL can be used as the foundation of an engine for executing graph
transformations exchanged as GTXL [22] documents. Another open issue is the
generation of mappings based on UML class diagrams and OCL constraints.
Abstract. This paper describes how the Fujaba CASE tool supports a
semi-automatic transformation of usecase scenarios, specified by so-called
story boards, into automatic test specifications and test implementations. A
story board is a sequence of graph snapshots showing the evolution of a
graph-based object structure during a typical example execution of a usecase.
From such an example execution we automatically derive a test specification
that executes the following three basic steps. First, a graph transformation
is generated that creates an object structure serving as the test bed for the
following steps. Second, we generate an operation that invokes the core method
realizing the corresponding usecase. Third, we generate a graph test with a
left-hand side corresponding to the graph structure described as the result in
the story board. On test execution, this graph test validates whether the
object structure resulting from the usecase execution matches the results
modeled in the usecase scenario. Support for this approach has been
implemented within the Fujaba CASE tool. The approach has been validated in a
major research project and in several student projects.



1 Introduction 

Many modern software development approaches propose a so-called usecase
driven process, e.g. the Rational Unified Process (RUP) [4]. In these approaches,
requirements are analyzed using usecase diagrams and textual scenario descrip- 
tions. During the analysis phase these textual scenario descriptions are refined 
using UML behavior diagrams like sequence diagrams or collaboration diagrams. 
In the design phase, the program structure is defined using e.g. class diagrams 
and the program behavior may be modelled using e.g. statecharts or in our case 
using graph grammar specifications. During these steps and during ongoing sys- 
tem maintenance, it is a major problem to ensure that the program behavior 
matches the behavior outlined in the usecase scenarios. 

There are two solutions to this behavioral consistency problem. First, there
are a number of approaches that generate program behavior from a number of
example scenarios. This works especially well for sequence diagrams and
statecharts, cf. [10, 11, 8, 5, 7]. Following such an approach guarantees
consistency between scenarios and program behavior. However, these approaches
are not yet very mature, and they require a large number of very elaborate
scenarios in order to work well. In addition, there are still unsolved
problems if the generated statecharts are further modified during design and
maintenance.

In this work we propose an alternative idea that we have implemented as
part of the Fujaba CASE tool project, cf. [3, 6]. We turn example scenarios into
test cases. From each usecase scenario we semi-automatically derive a JUnit test
operation. This test operation checks whether, given the example situation
modelled by the corresponding usecase description, the program behaves as
described with respect to some observable results. Thereby, we have a simple
means of plausibility testing for the consistency of usecase scenarios and
usecase implementation. To achieve this, our approach relies on scenario
specifications that use sequences of UML collaboration diagrams, i.e. graph
snapshots, for the analysis of usecases. We then turn certain graph snapshots
into graph transformations used by our test operations. Since graph
transformations are conceptually just pairs of graphs, this derivation of graph
transformations merely requires the copying of certain parts of the scenario
descriptions.

The following section introduces our running example. This is followed by
a description of our software development process. Section 4 outlines our
approach to test case generation. We close with some experiences and some
future work.

2 Running Example 

As the running example for this paper, we use the ISILEIT project funded by the
German Research Society (DFG Schwerpunktprogramm) at the University of
Paderborn, cf. [9], which deals with an automatic transportation system.
Figure 1 shows a simplified schematic view of such a transportation
system. This example employs a number of shuttles that transport goods and
material between different robots and assembly lines. Traditionally, such
systems are controlled by a single programmable logic device (PLD). However, to
become more flexible and to be able to scale to an arbitrary number of shuttles,
we proposed to model the different shuttles as autonomous agents. Each such
agent is in charge of a specific transportation task. Different agents may execute
different tasks at the same time. We modelled the behavior of each agent using
our Fujaba environment and were able to provide a simulation environment
that allows testing the interaction between the different agents.



3 The Fujaba Development Process 

The FUjaba development Process FUP extends the ideas of [6] by a more
elaborate process and by the explicit derivation of tests from scenarios. The
FUP is an iterative process starting with requirements elicitation based on
usecases. For each usecase FUP requires at least one textual scenario
description with a predefined structure. To allow this, we extended the Fujaba
CASE tool with an HTML-based text editor with embedded editing of UML diagrams,
see Figure 2.

Fig. 1. The ISILEIT automatic transportation system

Figure 2 shows a usecase diagram for our transportation system and the 
textual description of a standard scenario for usecase load shuttle. Note, in 
FUP each textual usecase description has a description of the start situation, 
a description of the invocation of that usecase, a number of steps outlining the 
execution of the usecase and a description of the result situation. 

In the next step, a special command derives a so-called story board from the 
textual scenario description. Initially, this story board is just an activity diagram 
with one activity for each element of the textual scenario. These activities contain 
the original textual descriptions as a comment. Now the developer models each 
step by a collaboration diagram that is embedded in the corresponding story 
board activity, cf. Figure 3. We call this phase story boarding. 
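
The derivation of the initial story board can be illustrated by a minimal
sketch. It assumes a simplified model in which a textual scenario consists of
four pieces of text. The names TextualScenario, Activity and StoryBoardSketch
are hypothetical and are not part of the Fujaba tool, and the result text used
in main is a placeholder.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of a textual FUP scenario: start situation,
// invocation, execution steps and result situation.
class TextualScenario {
    final String start;
    final String invocation;
    final List<String> steps;
    final String result;

    TextualScenario(String start, String invocation, List<String> steps, String result) {
        this.start = start;
        this.invocation = invocation;
        this.steps = steps;
        this.result = result;
    }
}

// One story board activity. The collaboration diagram is modelled later
// by the developer, so it starts out empty (null).
class Activity {
    final String comment;
    Object collaborationDiagram = null;

    Activity(String comment) {
        this.comment = comment;
    }
}

class StoryBoardSketch {
    // Creates one activity per scenario element, in scenario order,
    // and keeps the original text as the activity comment.
    static List<Activity> derive(TextualScenario s) {
        List<Activity> activities = new ArrayList<>();
        activities.add(new Activity(s.start));
        activities.add(new Activity(s.invocation));
        for (String step : s.steps) {
            activities.add(new Activity(step));
        }
        activities.add(new Activity(s.result));
        return activities;
    }

    public static void main(String[] args) {
        TextualScenario s = new TextualScenario(
            "An empty shuttle s is in front of an assembly robot ar which has created a key.",
            "The shuttle issues a fetch operation on itself",
            Arrays.asList("The shuttle opens its transportation clamps"),
            "Placeholder description of the result situation");
        for (Activity a : derive(s)) {
            System.out.println("// " + a.comment);
        }
    }
}
```

Each activity produced this way still lacks its collaboration diagram, which
corresponds to the modelling work done during story boarding.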

The first activity of Figure 3 models the start situation of the load shuttle
scenario with a shuttle in front of an assembly line that owns a good of type
key. The developer modelled this situation as an object diagram, i.e. as a graph
consisting of a shuttle object s and an assembly robot object ar that are located
at the same field f. In addition, there is a good object g attached to ar.

The second activity shows a collaboration diagram modelling the invocation 
of the corresponding usecase. In FUP, each usecase is finally realized by a method 
of some object. Thus, the second activity of the story board just shows one 
collaboration message naming the operation that corresponds to the usecase. In 
our case this is the operation loadShuttle. 

The following activities correspond to the scenario execution steps. They
are modelled by collaboration diagrams/graph transformations that typically
show a method invocation and the object structure modifications caused by this
operation. For example, in our simulation, the openClamps operation in the third
activity causes an assignment to the clamps attribute of the shuttle object s.

Fig. 2. Usecase diagram example

Note, in Fujaba the left-hand and right-hand side of a graph transformation
are shown as a single graph with «destroy» and «create» markers identifying
elements that are only contained in the left-hand or only in the right-hand side,
respectively. See for example how the holds link is replaced by a carries link
in activity 5 in order to model that the good is loaded onto the shuttle.
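
As a minimal sketch of this single-graph notation, the following Java fragment
derives both sides of a rule from one marked graph. The types Marker,
MarkedLink and CombinedRule are hypothetical illustrations, not Fujaba
classes. The link directions (ar holds g, s carries g) and the link name "at"
are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Marker attached to a graph element in the combined notation.
enum Marker { NONE, DESTROY, CREATE }

// A link such as "ar -holds-> g" together with its marker.
class MarkedLink {
    final String source;
    final String label;
    final String target;
    final Marker marker;

    MarkedLink(String source, String label, String target, Marker marker) {
        this.source = source;
        this.label = label;
        this.target = target;
        this.marker = marker;
    }

    String text() {
        return source + " -" + label + "-> " + target;
    }
}

class CombinedRule {
    // Left-hand side: every element that is not marked as created.
    static Set<String> leftHandSide(List<MarkedLink> links) {
        Set<String> lhs = new LinkedHashSet<>();
        for (MarkedLink link : links) {
            if (link.marker != Marker.CREATE) {
                lhs.add(link.text());
            }
        }
        return lhs;
    }

    // Right-hand side: every element that is not marked as destroyed.
    static Set<String> rightHandSide(List<MarkedLink> links) {
        Set<String> rhs = new LinkedHashSet<>();
        for (MarkedLink link : links) {
            if (link.marker != Marker.DESTROY) {
                rhs.add(link.text());
            }
        }
        return rhs;
    }

    public static void main(String[] args) {
        // Assumed content of activity 5: holds is destroyed, carries is created.
        List<MarkedLink> activity5 = Arrays.asList(
            new MarkedLink("s", "at", "f", Marker.NONE),
            new MarkedLink("ar", "holds", "g", Marker.DESTROY),
            new MarkedLink("s", "carries", "g", Marker.CREATE));
        System.out.println("LHS: " + leftHandSide(activity5));
        System.out.println("RHS: " + rightHandSide(activity5));
    }
}
```

For the link replacement of activity 5, the left-hand side keeps the holds
link, the right-hand side keeps the carries link, and unmarked elements appear
on both sides.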

In our approach, the last activity of a story board always models an object
diagram/a graph representing the result of the usecase execution with respect to
the corresponding start situation. Thus, the object diagram of the start situation
may also be interpreted as one possible pre-condition for the execution of the
usecase and the object diagram of the corresponding result situation may be
interpreted as the post-condition for this scenario that has to be ensured by
the implementation of the operation that realizes the usecase functionality. This
is the operation that is called within the second story board activity.
[Figure 3: story board Scenario_load_shuttle_usual_case. Activity comments
recoverable from the diagram: "An empty shuttle s is in front of an assembly
robot ar which has created a key."; "The shuttle issues a fetch operation on
itself"; "The shuttle opens its transportation clamps".]

Fig. 3. Story board example
During story boarding all used elements like objects, links, attributes and
methods have to be provided with appropriate declarations in an accompanying 
class diagram. This already ensures a consistent use of object kinds, attributes, 
links and methods throughout all scenarios and even within the following design 
phase. However, this does not yet cover consistency at the behavioral level. 

When a scenario has been modelled by a story board, Fujaba provides a
command to turn it automatically into a simple JUnit test specification for this
scenario. Basically, we generate a method with three major parts. The first part is
derived from the modelled start situation of the story board. The generated test 
operation just creates a similar object structure at runtime. The second part is 
the invocation of the operation to be tested. The third part is derived from the 
last activity of the story board that is supposed to model the object structure 
that results from the scenario execution. We turn this into an operation/a graph 
test that compares the object structure resulting from the test execution with 
the object structure modelled as the result of the scenario. The test specification 
derived from the story board in Figure 3 is shown in Figure 4. 

At this point, we are already able to generate code for the JUnit test and to 
run it. The result is shown in Figure 5. For now the JUnit test should fail since 
the implementation has not yet been done. 

Now the developer “just” has to design and implement the methods employed
within the scenario such that the modelled behavior is achieved. Some
methodological help for this step is proposed in [2]. Note, the FUP has been
inspired by the Test-First Principle of eXtreme Programming [1]. However, we
do not agree that everything is done as soon as the scenario tests are passed.
During the design and implementation phase and later on during maintenance,
the developer may at any time check whether his/her implementation already
or still matches the corresponding usecase descriptions. In our experience,
this simple approach already helps a lot to keep the scenarios created during
the analysis phase and the actual system design and implementation in a
consistent state. Note, we do not claim to provide a thorough approach for
system testing. Our focus is just on keeping the system documentation provided
by the analysis scenarios up-to-date.



4 Generating JUnit Testcases 

As already mentioned, the basis of our test generation approach is that
the usecase scenarios are modelled using a combination of UML activity and
collaboration diagrams, so-called story boards. To facilitate the test generation,
these story boards must have a certain structure. They always have to start with
a model of the Start Situation, i.e. with a graph representing the initial object
structure. This describes the precondition for the story board and is followed by
the Invocation step, which is a collaboration diagram containing an invocation
of the method that implements the functionality of the corresponding usecase.
The last step of the story board is the Result Situation showing the final object
structure. The Result Situation is the postcondition of the story board. This
specific story board structure is supported by our tool, as every story board is
derived from a textual description which already has this structure.

Fig. 4. The generated test

Creating tests is now simply copying graphs into graph transformations and 
generating code from these graph transformations. To create a test case specifi- 
cation, the following steps are automatically executed by the Fujaba tool: 

1. Fujaba creates a JUnit test class for each usecase and adds a test method
for each scenario of this usecase to this class. In addition, this new JUnit
test class is added to a JUnit test suite for the whole usecase diagram. In
the example, a class Scenario_Use_Case_loadshuttle_normalTest is created
containing a method testScenario_Use_Case_loadshuttle_normal.

Note, we have chosen the JUnit test framework just because it is simple and
popular. Our approach does not depend on this framework.

Fig. 5. Example test run



2. Fujaba copies the graph of the start situation to the right-hand side of the 
first graph transformation of the test method. 

Note, in Fujaba the left-hand side and the right-hand side of a graph
transformation are shown as a single graph, where elements that are only part of
the left-hand side are marked by a «destroy» label and elements that are
only part of the right-hand side are marked using a «create» label. Thus,
actually, we copy the graph of the start situation to the first activity of the
test method and then we add «create» labels to all elements of the graph,
cf. the first activity of Figure 3 and Figure 4.

As a result, on test execution, the first graph transformation of the test 
method creates exactly the object structure, that has been modelled as the 
starting point of the corresponding usecase scenario. 

3. Fujaba copies the graph of the Invocation to the left-hand and to the
right-hand side of the second graph transformation of the test method.

Actually, in the Fujaba notation, we need only one copy of the graph without
«destroy» and «create» labels, cf. the second activity of Figure 4. At
execution time, the resulting graph transformation will not modify the given
graph; it just re-computes the match between graph transformation and
host graph. In Fujaba, this matching can be facilitated by using so-called
bound objects. Bound objects are rendered as boxes containing just the
object name but not the object type. For the Java code generator of the
Fujaba environment, a bound object indicates that for this object the match
of a previous graph transformation step shall be reused. We do not need to
compute the match again. Thus, we copy the graph of the invocation activity
to the test method and we mark all objects as bound in order to facilitate
the test execution.

Note, as already discussed, we assume that the start situation of a story 
board models the whole object structure that is a pre-condition for the 
considered scenario. In addition, the invocation step shall just show the 
method-calls that trigger the execution of the scenario but it should not 
introduce new graph elements that are not already mentioned in the start 
situation. Thus, the object structure employed in the invocation step should 
be a subgraph of the object structure modelled in the start situation. During
test generation this restriction allows us to mark all objects of the
invocation activity as bound, which facilitates the test execution. If the
invocation step introduced new graph elements, the invocation activity of the
test method would have to create these elements. This means that we would have
to identify such newly introduced graph elements and to mark them with
«create» labels. This is not yet implemented.

As already mentioned, the invocation step of the story board should contain 
a method invocation triggering the execution of the corresponding usecase. 
This method invocation is copied from the story board to the test method, 
too. In Fujaba, such a method invocation is shown as a collaboration mes- 
sage. Fujaba graph transformations execute such a collaboration message 
after the execution of the rewrite step. Thus, at test execution time, the 
second activity of the test method just invokes the method that should im- 
plement the corresponding usecase. In our example, method loadShuttle is 
issued on the shuttle object. 

4. Fujaba copies the graph of the Result Situation to the left- and right-hand 
side of the third graph rewrite rule. As already discussed, in Fujaba this 
requires just a plain copy. In addition, we mark all objects of the result 
situation that are already part of the start situation as bound in order to 
facilitate the test execution. In our example, all objects of the result situation 
are already known from the start situation, thus all objects of the third
activity are marked as bound.

At test execution time, the resulting graph transformation compares the 
object structure resulting from the method invocations of the previous step 
with the object structure modelled as result situation in the original story 
board. 

Note, although our example employs only bound objects, i.e. for all objects
we reuse the match stemming from the start situation, our result situation
differs from our start situation with respect to the holds and carries
link. Thus, in our example the final test activity checks whether the invocation
of method loadShuttle in the previous step actually achieved that object s
has a carries link to the object g and whether its kind attribute has the value
“key”, cf. Figure 4.

5. Fujaba creates two outgoing transitions from the result activity of the test
method that signal the success or failure of the test.

If the graph test of the result activity is successfully applied we provide a
[success] transition leading to a stop activity. At test execution time, this
will just terminate the test and JUnit will flag the test as passed, cf. Figure 4.
If the graph test of the result activity does not match, we provide
a [failure] transition leading to a so-called statement activity that
executes the JUnit command fail() and then we terminate the method. At
test execution time, this causes JUnit to flag that the test has not been
passed.

6. Fujaba generates code for the test class. 

The Fujaba code generator is able to translate such (test) method
specifications into ordinary Java code. This Java code may be employed together
with manually coded program parts, it may utilize existing libraries, and it
may be embedded into existing frameworks such as the JUnit framework.

7. Fujaba compiles the code with a standard Java compiler and the JUnit 
library. 

8. Fujaba runs the test within the JUnit framework. Figure 5 shows that the
test execution has failed, as in our example the loadShuttle() method has
not yet been implemented.

9. If the test fails, Fujaba debugs the test execution and uses our Dynamic Ob- 
ject Browsing System DOBS in order to visualize the result graph stemming 
from the test execution and to compare it with the result graph modelled in 
the story board/test specification, cf. Figure 6. 
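
The role of bound objects in the graph tests of steps 3 and 4 can be sketched
as follows. This is a simplified model under stated assumptions, not the code
produced by the Fujaba generator: the host graph is a set of encoded links,
the binding maps pattern object names to host nodes matched in an earlier
step, and the class BoundGraphTest with its methods is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class BoundGraphTest {
    // Encodes a link of the host graph as "source|label|target".
    static String link(String source, String label, String target) {
        return source + "|" + label + "|" + target;
    }

    // Checks a pattern whose objects are all bound. No search is needed:
    // each pattern link {sourceName, label, targetName} is looked up for
    // the host nodes that an earlier step has already matched.
    // The host graph is only read, never modified.
    static boolean matches(List<String[]> patternLinks,
                           Map<String, String> binding,
                           Set<String> hostLinks) {
        for (String[] p : patternLinks) {
            String source = binding.get(p[0]);
            String target = binding.get(p[2]);
            if (source == null || target == null) {
                throw new IllegalArgumentException(
                    "unbound object in pattern link " + Arrays.toString(p));
            }
            if (!hostLinks.contains(link(source, p[1], target))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Binding as it would be left behind by the creation step.
        Map<String, String> binding = new HashMap<>();
        binding.put("s", "node1");
        binding.put("g", "node3");

        // Host graph after a successful loadShuttle call (assumed).
        Set<String> host = new HashSet<>();
        host.add(link("node1", "carries", "node3"));

        List<String[]> resultSituation = new ArrayList<>();
        resultSituation.add(new String[] {"s", "carries", "g"});

        // The boolean result selects the [success] or [failure] transition.
        System.out.println(matches(resultSituation, binding, host)
            ? "[success]" : "[failure]");
    }
}
```

Because every pattern object is already bound, the check reduces to one lookup
per pattern link, which is the simplification that marking objects as bound is
meant to achieve.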

To summarize, from a story board modelling an example execution of a usecase
we derive a JUnit test method that performs three major steps. First,
the object structure is created that models the start situation of the considered
scenario. Second, we invoke the method that implements the usecase. Third,
we compare the object structure resulting from the method invocation with the
object structure modelled as result situation within the story board.
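
The three steps can be made concrete with a hand-written Java sketch of the
load shuttle test. It is not the code generated by Fujaba: the classes Field,
Good, AssemblyRobot and Shuttle, their attributes, and the body of loadShuttle
are assumptions made for illustration, and a plain AssertionError stands in
for the JUnit command fail().

```java
// Hypothetical object structure for the load shuttle scenario.
class Field {
    AssemblyRobot robot;   // robot located at this field, if any
}

class Good {
    final String kind;

    Good(String kind) {
        this.kind = kind;
    }
}

class AssemblyRobot {
    Field at;
    Good holds;
}

class Shuttle {
    Field at;
    Good carries;

    // Assumed implementation of the usecase method: take over the good
    // held by the robot located at the shuttle's field.
    void loadShuttle() {
        AssemblyRobot robot = (at == null) ? null : at.robot;
        if (robot != null && robot.holds != null) {
            carries = robot.holds;
            robot.holds = null;
        }
    }
}

class LoadShuttleScenarioTest {
    static void testLoadShuttle() {
        // Step 1: create the start situation (all elements are created).
        Field f = new Field();
        Shuttle s = new Shuttle();
        AssemblyRobot ar = new AssemblyRobot();
        Good g = new Good("key");
        s.at = f;
        ar.at = f;
        f.robot = ar;
        ar.holds = g;

        // Step 2: invoke the method that realizes the usecase.
        s.loadShuttle();

        // Step 3: graph test on the bound objects s and g.
        boolean matched = s.carries == g && "key".equals(g.kind);
        if (!matched) {
            // [failure] transition; the generated test would call fail().
            throw new AssertionError("result situation not matched");
        }
        // [success] transition: the method simply returns.
    }

    public static void main(String[] args) {
        testLoadShuttle();
        System.out.println("test passed");
    }
}
```

With an empty loadShuttle body the third step throws, which mirrors the failed
test run obtained before the usecase method has been implemented.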

Note again, such simple tests may only be used to check whether the imple- 
mentation of a usecase matches the corresponding usecase descriptions. They do 
not provide thorough system tests. We have no experience whether this approach 
can be extended for the purpose of system testing or not. 

5 Conclusions and Future Work 

Our approach of generating test methods out of the scenarios has been used 
with good success in several student projects with altogether about 60 students. 
Within these projects we have made the following observations. 

Since we use scenarios, i.e. story boards, as input for test and code
generation, our approach requires relatively elaborate scenario diagrams.
However, we do not consider this a flaw but a feature. In earlier student
projects, without the subsequent test generation, we observed that the analysis
scenarios frequently missed important details. In these earlier projects, the
scenarios frequently covered only a small fraction of the important design
aspects while many important questions were postponed to the design and
implementation phase. Using the test generation mechanisms, in the new projects
the students were forced to come up with an object structure that covered
almost all important design aspects of our example applications. We saw much
better scenarios and analysis documents.

In addition, we observed that the students were much more ready to invest
work into better scenarios since they knew that this work pays off through the
generated test specifications.

Note, although we require a level of detail that allows code generation, it 
is still possible to model the scenarios at a very high level of abstraction. The 
scenarios only have to cover the relevant parts of the application domain. They 
still do not need to deal with implementation details. In the student projects, 
such details were typically added during the design and implementation of the 
usecase realization methods. In that phase, the students introduced supporting 
object structure details and they dealt with algorithmic and GUI problems and 
so on. 

All of our students followed the test-driven software development approach
proposed by the Fujaba Process. Thus, they first generated the JUnit tests and
then began to realize the scenarios step by step while using the tests as
executable requirements checks. In our impression, this approach worked
very well. The students were very focused in their design and implementation
work, and trying to fulfill the test conditions was very motivating for them. We
had some concerns that the students would come up with very specific
implementations covering only the cases that were tested. However, this happened
only a few times when time pressure was very high at the end of the project.
Most of the time, the students came up with general methods that covered
general cases, too, and that would work in all plausible usage scenarios.

Overall, the desired behavioral consistency between analysis scenarios and
design and implementation was very well achieved. In our experience, the
resulting scenarios provided a much better documentation of the implemented
systems than the documentation produced in earlier projects
without our test generation approach.

Besides these encouraging results, we observed a number of possible
improvements. First of all, there is a learning curve. Our students needed some
time to understand how the start situation, the invocation step and the result
situation have to be modelled in order to fit our test generation. Frequently,
the start situation did not show all objects and graph elements that were
required in order to allow the usecase method to work without problems. The
students tended to introduce some of these objects during subsequent scenario
steps. In the future we will address this problem more explicitly in our lectures.
In addition, the Fujaba tool might check subsequent scenario steps for not yet
introduced graph elements and either mark them as errors or even
try to extend the start situation with such elements automatically during test
generation.
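
A check of this kind could look roughly like the following sketch. It is a
hypothetical illustration and not an existing Fujaba feature: scenario steps
are reduced to sets of object names, the class ScenarioChecker is invented for
the example, and elements that a step legitimately creates would have to be
filtered out before the comparison.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class ScenarioChecker {
    // Returns, for each later step (0-based position in laterSteps), the
    // object names used there that are missing from the start situation.
    // Steps without such objects do not appear in the result.
    static Map<Integer, Set<String>> undeclaredObjects(
            Set<String> startObjects, List<Set<String>> laterSteps) {
        Map<Integer, Set<String>> problems = new TreeMap<>();
        for (int i = 0; i < laterSteps.size(); i++) {
            Set<String> unknown = new TreeSet<String>(laterSteps.get(i));
            unknown.removeAll(startObjects);
            if (!unknown.isEmpty()) {
                problems.put(i, unknown);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Start situation that forgets the field f (assumed example).
        Set<String> start = new TreeSet<String>(Arrays.asList("s", "ar", "g"));
        List<Set<String>> steps = new ArrayList<>();
        steps.add(new TreeSet<String>(Arrays.asList("s")));
        steps.add(new TreeSet<String>(Arrays.asList("s", "f")));
        System.out.println(undeclaredObjects(start, steps));
    }
}
```

The reported names could either be flagged as errors or be added to the start
situation, which are the two options mentioned above.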
Fig. 6. Object structure in Dobs



Similarly, we frequently had the problem that the students misused destroy 
and create labels and they made mistakes in using bound objects in the scenario 
steps. Unfortunately, these faults often remained undetected until test genera- 
tion or even until test execution time. Thus, the students were always uncertain 
whether the test failed due to an erroneous implementation or due to such a mis- 
take within the scenario. This problem may be addressed by extending Fujaba 
with a compile time checker for scenarios. 

Transforming Graph Based Scenarios

If the test failed due to an incorrect implementation, most of the time it
was very difficult for the students to find the cause of the problem. JUnit just
reported an AssertionFailedError without further hints, cf. Figure 4. In this
situation, the students had to use a debugger and our Dynamic Object Browsing
System DOBS to execute the test stepwise, to inspect the runtime object
structure, and to compare it with the object structure expected by the result
situation graph test, cf. Figure 6. This debugging was very tedious. To improve
this, as a first step it should be reported which parts of the result situation
graph test were successful and which parts did not match, and DOBS should
be started automatically, allowing the comparison between expected and actual
result graph. This might give a hint as to which object or link is missing or
which attribute has a wrong value. The problem would then be much easier to
understand than with just the cryptic JUnit exception outputs, and debugging
would be facilitated.
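
Such a report could be produced along the lines of the following sketch. It
describes a possible extension, not existing Fujaba or JUnit behavior: the
expected result situation and the runtime object structure are both reduced to
plain strings, and the class GraphTestReport is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class GraphTestReport {
    // Lists the expected elements (links or attribute conditions)
    // that are not present in the actual object structure.
    static List<String> unmatched(List<String> expected, Set<String> actual) {
        List<String> missing = new ArrayList<>();
        for (String element : expected) {
            if (!actual.contains(element)) {
                missing.add(element);
            }
        }
        return missing;
    }

    // Builds a message that is more informative than a bare assertion error.
    static String describe(List<String> expected, Set<String> actual) {
        List<String> missing = unmatched(expected, actual);
        if (missing.isEmpty()) {
            return "result situation matched";
        }
        return "result situation not matched, missing: " + missing;
    }

    public static void main(String[] args) {
        List<String> expected = Arrays.asList("s -carries-> g", "g.kind = key");
        // Assumed runtime structure: the good is still held by the robot.
        Set<String> actual = new HashSet<>(
            Arrays.asList("ar -holds-> g", "g.kind = key"));
        System.out.println(describe(expected, actual));
    }
}
```

In the example the message names the missing carries link, which is the kind
of hint that the bare AssertionFailedError does not give.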

To complete the implementation it could also be useful to know up to which
point the test execution fits the modelled scenario and where it starts to differ
such that the postcondition check finally fails. In our example it could happen
that, e.g., in the fifth graph of the story board (Figure 3) the holds link is
destroyed but the carries link is not created. To help identify such problems,
the Fujaba tool should report that the test execution meets the modelled scenario
until scenario step four and that scenario step five is never reached. This may
be achieved by two different techniques. First, the developer could add some
“control points” to the implementation, where at test runtime it is checked whether
the object structure matches a certain scenario step. It should be possible to add
such a mechanism to Fujaba. Another idea is to derive detailed traces from the
test runs and to compare these execution traces to the story board. However,
this approach faces the problem that the test trace probably shows all kinds of
implementation details while the story board scenario is modelled at the domain
level of abstraction. We need to find out how to filter the scenario-relevant steps
from a test trace. Generally, we would like to make more use of the intermediate
steps of a story board.
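
The control point idea can be sketched as a small log that records how far the
execution follows the scenario. This is only an illustration of the proposed
mechanism, which does not exist in Fujaba. The class ControlPointLog is
hypothetical, and the result of each graph test is passed in as a boolean
instead of being computed.

```java
class ControlPointLog {
    private int lastMatchedStep = 0;

    // Called from a control point in the implementation. A step counts
    // only if the object structure matches and all earlier steps matched.
    void check(int step, boolean structureMatches) {
        if (structureMatches && step == lastMatchedStep + 1) {
            lastMatchedStep = step;
        }
    }

    int lastMatchedStep() {
        return lastMatchedStep;
    }

    // Summarizes how far the test execution followed the scenario.
    String report(int totalSteps) {
        if (lastMatchedStep >= totalSteps) {
            return "execution matches all " + totalSteps + " scenario steps";
        }
        return "execution matches the scenario until step " + lastMatchedStep
            + "; step " + (lastMatchedStep + 1) + " is never reached";
    }

    public static void main(String[] args) {
        ControlPointLog log = new ControlPointLog();
        for (int step = 1; step <= 4; step++) {
            log.check(step, true);
        }
        // Step five: the holds link is destroyed but carries is not created.
        log.check(5, false);
        System.out.println(log.report(6));
    }
}
```

The total of six steps in main is an arbitrary choice for the example. The
second technique, comparing execution traces with the story board, is not
covered by this sketch.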
Abstract. This paper deals with the subject of knowledge-based computer
aided design. A novel method, giving additional support for conceptual
design, is presented. Using this method, a designer first specifies
the functional requirements and the structure of the object to be
designed, based on use cases and function graphs. A prototype design is
then derived from these requirements. Subsequently, the designer checks
the fulfilment of certain consistency rules and engineering norms by the
application of a constraint checker. This checker uses background
knowledge stored in graph structures and the reasoning mechanism provided by
the graph rewriting system PROGRES. An example of designing a swimming
pool illustrates the proposed methodology.



1 Introduction 

Designers, especially architects, very frequently use graph structures to represent 
functional and spatial relations of the object to be designed. Based on this ob- 
servation, a new conceptual design method has been created, in which use cases 
and scenarios, as well as functional requirements and constraints, are specified 
by using graph structures. At present, we restrict our considerations to architec- 
tural design, but it seems to be the case that the developed method could also 
be applied to other engineering domains such as Machine Building or Electrical 
Engineering. The method is presented here for designing a swimming pool, but 
it has already been applied to and is suitable for other types of buildings. The 
design tool GraCAD [25] currently being elaborated, which utilises this method, 
can be seen as a conceptual pre-processor for a new generation of CAD-tools. 
For prototyping the graph part of GraCAD we use the graph-rewriting system 
PROGRES developed at the RWTH Aachen [21], i.e. we use a kind of graph 
transformation-based approach for knowledge representation purposes. 



J.L. Pfaltz, M. Nagl, and B. Bohlen (Eds.): AGTIVE 2003, LNCS 3062, pp. 75—89, 2004. 
Pioneered by N. Chomsky [8], the linguistic (grammar-based) approach to
world modelling has been applied in many areas. The core idea of this
methodology is to treat certain primitives as letters of an alphabet and to
interpret more complex objects and assemblies as words or sentences of a
language based upon the alphabet. Rules governing the generation of words and
sentences define the grammar of the considered language. In terms of world
modelling, such a grammar generates a class of objects that are considered to
be plausible. Thus, grammars provide a very natural knowledge representation
formalism for computer-based tools that should aid design.

Since G. Stiny [22] developed shape grammars, many researchers have shown
how such grammars allow the architect to capture essential features of a certain
style of buildings. However, the primitives of shape grammars are purely
geometrical, which restricts their descriptive power. Substantial progress was
achieved after graph grammars were introduced and developed (cf. e.g. [20]).
Graphs are capable of encoding much more information than linear strings or
shapes. Hence, their applicability for CAD-systems was immediately appreciated [12].

A special form of graph-based representation used for design purposes was 
developed by E. Grabska [13] as early as 1994. Later on, Grabska’s model 
served as the basic knowledge representation scheme in research reported at 
conferences in Stanford [14], Ascona [4], and Wierzba [5]. It turned out that 
by introducing an additional kind of functionality graphs into Grabska’s model, 
conceptual solutions for the designed object can be conveniently reasoned about. 
The additional functionality analysis of houses, as the starting point for the 
conceptual design, has been proposed by several researchers (compare, e.g. [6], 
[9]). Such a methodology allows the designer to detach himself from details and 
to consider more clearly the functionality of the designed object incorporating 
the constraints and requirements to be met, and the possible ways of selecting 
optimum alternatives. 

The results described in sections 2 and 3 can be viewed as the further develop- 
ment of the research reported previously in [23], [24] and [25], i.e. of combining, 
for the first time, graph transformation techniques with a conceptual design ap- 
proach for buildings, based on functionality analysis. As a result of co-operation 
with architects, we were able to apply our method to a more complex, realistic 
example than in previous publications and verify the usefulness of this method. 
Following the suggestions of one architect (the 2nd author of this paper), our method 
has been supplemented with a new requirements elicitation phase (cf. Section 2). 
UML activity diagrams [3] are now used to capture the behaviour of prototypical 
users of the designed object, in the form of use cases. Apart from this extension 
of our conceptual design method, the paper studies different ways of replacing 
the previously used ‘procedural’ way of implementing constraint checks, by using 
the new built-in graph constraint checking mechanisms of the language PROGRES 
(cf. Section 3). We will see that these new constructs are still difficult to 
use and should be improved in various ways. 

2 Graph Technology Supporting Conceptual Design 

This section concerns a very early phase of engineering design called conceptual 
design. The main aim of conceptual design is to specify the functional require- 
ments and constraints resulting from the conversation with a customer, and then 
to create a prototype design that fulfils the specified requirements. In this phase 
the designer operates on high-level concepts. Details that would distract him 
from conceptual thinking are omitted. Rapid specifying of consistent functional 
requirements and the creation of a prototype design meeting these requirements 
are very important in the process of designing. Such a prototype facilitates com- 
munication with the customer and is the basis for further discussion about the 
designed object. Such a discussion facilitates the understanding of the customer’s 
intentions and preferences, and may result in modifications to the requirements or the 
prototype. After reaching an agreement with the customer, the designer begins 
working out the detailed design. Below, a novel method supporting conceptual 
design activities of architects is presented. 

Architects very frequently use graphs to depict the functional and spatial 
relations of designed objects. Furthermore, they use control flow graphs, similar 
to UML activity diagrams, to show the order of activities performed in the con- 
sidered design object; i.e. — similar to software engineers — architects follow 
a use case driven approach for requirements elicitation purposes (cf. [16], [19]). 
Based on these observations, we have created a method that addresses the con- 
ceptual phase of architectural design, in which the functional requirements and 
constraints for designed buildings are specified in the form of graph structures. In 
this method, UML use case and activity diagrams are integrated with so-called 
area and room graphs, which are then translated into a prototype design (cf. Fig. 
1). One of the main advantages of the graph-based design approach introduced 
thereby is the possibility to specify domain-specific design rules and norms on 
area and room graphs, in the form of constraints on a very high level, and to 
derive the corresponding consistency checking code automatically by using the 
graph transformation system PROGRES. 

In the following, the new design method is explained in more detail using 
the running example of a swimming pool. The architect starts with a func- 
tional analysis of the building, based mainly on conversations with an investor 
(cf. Fig. 1): 

— First, the architect specifies some basic parameters. For the swimming pool 
example these parameters include the type of the swimming pool (recre- 
ational, sports, learner) and an approximate number of users. 

— Then the architect identifies different sorts of users (stakeholders) of the con- 
structed building ( swimming pool client, lifeguard, technical staff, cleaning 
service, administration staff etc.). 

— The following step is to identify use cases for these users: ‘swimming’ for 
swimming pool clients, ‘observing’ and ‘rescuing swimmers’ for lifeguards 
(further considerations are restricted only to the client and the lifeguard). 

— The next step is to define for each use case a scenario in the form of an 
activity graph (UML activity diagram), which explains in more detail the 
planned interaction between the user and the designed object. These activity 
graphs model the most frequent and important behaviours of the users. By 
creating scenarios in the form of activity graphs for various types of users, 
the functionality of the object is considered from various points of view (from 
the perspective of the client of the swimming pool, the lifeguard — cf. Fig. 
2, etc.). 

— Then the architect has to decompose the designed object into areas. For this 
purpose, the architect creates an area graph in which area nodes and relations 
between them like accessibility, visibility, adjacency are specified. Area nodes 
correspond to functional areas of the designed building. Fig. 3 presents an 
area graph; area nodes are surrounded by dashed lines; edges between area 
nodes are not displayed, to preserve the clarity of our example. 1 

— Afterwards, activity nodes from activity graphs have to be mapped onto 
area nodes. Mapping activities onto a given area node means that these 
activities are performed in the assigned area. This mapping could be skipped 
for small and medium size buildings (mapping onto areas is not displayed in 
our example). 

1 For some types of buildings, especially small or medium ones, like single storey family 
houses, the specification of an area graph could be skipped; for big buildings, like an 
airport, the decomposition into areas is recommended. 

Fig. 2. Activity graphs/diagrams a) for client of swimming pool b) for lifeguard 

— Next, a room graph has to be specified, i.e. the decomposition of the designed 
object at the level of rooms. Room nodes, relations between rooms, and the 
assignment of rooms to areas are specified. The relations most frequently used 
by architects at the level of rooms are accessibility, adjacency and visibility, 
but other relations could be introduced as well. 

— Finally, activities from activity graphs are transferred from their assigned 
areas to the corresponding rooms. Mapping an activity onto a given room 
node means that the activity is performed in the room. 

By means of this method, a functional requirements graph of the building, 
consisting of areas, rooms, activities, and edges between them, is created (cf. Fig. 
3). Lower level decomposition elements like changing cabins, cabinets, urinals, 
showers etc. could be introduced into the graph model, but they have been 
excluded from our example. 
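
The functional requirements graph described above can be pictured as a small typed graph with labelled edges. The following Python fragment is only an illustrative sketch: the class `RequirementsGraph`, its methods and the node names are hypothetical and are not part of GraCAD or PROGRES. Only the edge labels `next` and `activityRoom` are taken from the specification in Section 3, and the accessibility edge between the two rooms is assumed for the sake of the example.

```python
from collections import defaultdict


class RequirementsGraph:
    """Typed nodes (area, room, activity) plus labelled directed edges."""

    def __init__(self):
        self.kind = {}                  # node name -> "area" | "room" | "activity"
        self.edges = defaultdict(set)   # edge label -> set of (source, target)

    def add_node(self, name, kind):
        self.kind[name] = kind

    def add_edge(self, label, source, target):
        # Both end points must have been declared before they are linked.
        if source not in self.kind or target not in self.kind:
            raise KeyError("unknown node in edge %s -> %s" % (source, target))
        self.edges[label].add((source, target))

    def targets(self, label, source):
        """All nodes reached from `source` over one edge with this label."""
        return {t for (s, t) in self.edges[label] if s == source}


# A fragment of the swimming pool example (activities J and K of Fig. 3).
g = RequirementsGraph()
g.add_node("Showers", "room")
g.add_node("IndoorSwimmingPool", "room")
g.add_node("Showering", "activity")
g.add_node("Swimming", "activity")
g.add_edge("accessibility", "Showers", "IndoorSwimmingPool")   # assumed edge
g.add_edge("next", "Showering", "Swimming")
g.add_edge("activityRoom", "Showering", "Showers")
g.add_edge("activityRoom", "Swimming", "IndoorSwimmingPool")
```

Keeping all relations in one dictionary keyed by edge label means that further relations (visibility, adjacency, room-to-area assignment) can be added without changing the class.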

Functional requirements defined in this way may be checked with respect to 
quite a number of general or domain specific consistency rules. For instance, the 
following consistency rules have been identified for activities: 

— Check if the order of activities in the activity graph is appropriate: for many 
types of architectural objects it is possible to specify the permitted order of 
activities for the considered subdomain in advance and independent from the 
design process of a specific object of this type. For instance, in a swimming 
pool after showering the next activity might be swimming. 

— Check if for every activity an appropriate room and area is assigned: Start 
and Stop activities are exceptions, which are used to mark beginning and 
ending of a scenario. All other activities are first assigned to specific areas 
and then to the rooms related to these areas. 

Edges: accessibility, visibility 

Activities: 
A, L - Entry To Building; B, M - Exit; C - Communicating; D - Using Toilet; 
E - Ticket Purchase; F, N - Undressing; G, O - Dressing; H - Storing Clothes; 
I - Clothes Reclaim; J - Showering; K - Swimming; P - Observing Swimmers; 
R - Giving First Aid 

A-K: activities for a swimming pool client; L-R: activities for a lifeguard 

Fig. 3. Swimming pool graph 



— Check whether users are able to perform the activities comfortably in the 
order imposed by the activity graph if the building (at the level of rooms) 
has a structure matching the defined room graph. 



In Section 3, the PROGRES implementation of the graph checker for activity- 
related consistency rules explained above is presented. 
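
Independently of the PROGRES implementation, the second and third activity-related rules can be sketched in a few lines of Python. This is a simplified reading under stated assumptions, not the checker of Section 3: the function names and the shortened scenario are hypothetical, accessibility is treated as an undirected relation, and "comfortably" is interpreted as "in the same room or in a directly accessible one".

```python
EXEMPT = {"Start", "Stop"}   # marker activities that need no room


def unassigned_activities(activities, activity_room):
    """Activities (other than Start/Stop) without any assigned room."""
    return sorted(a for a in activities
                  if a not in EXEMPT and not activity_room.get(a))


def unreachable_steps(next_edges, activity_room, accessibility):
    """Next edges a -> b for which no room of a equals, or is linked by an
    accessibility edge to, a room of b (edges touching Start/Stop are skipped)."""
    linked = {frozenset(e) for e in accessibility}   # treat edges as undirected
    bad = []
    for a, b in next_edges:
        if a in EXEMPT or b in EXEMPT:
            continue
        pairs = ((ra, rb) for ra in activity_room.get(a, ())
                 for rb in activity_room.get(b, ()))
        if not any(ra == rb or frozenset((ra, rb)) in linked for ra, rb in pairs):
            bad.append((a, b))
    return bad


# Shortened, hypothetical client scenario; Swimming is deliberately unassigned.
activities = ["Start", "Undressing", "Showering", "Swimming", "Stop"]
next_edges = [("Start", "Undressing"), ("Undressing", "Showering"),
              ("Showering", "Swimming"), ("Swimming", "Stop")]
activity_room = {"Undressing": {"ChangingCabins"},
                 "Showering": {"Showers"},
                 "Swimming": set()}
accessibility = {("ChangingCabins", "Showers")}

missing_rooms = unassigned_activities(activities, activity_room)
broken_steps = unreachable_steps(next_edges, activity_room, accessibility)
```

In this scenario both checks point at the same defect: Swimming has no room, so the step from Showering to Swimming cannot be walked.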

1) main entrance 2) entrance hall 3) ticket desk 4) changing cabins 5) common 
changing rooms 6) hall 7) showers 8) toilets 9) apparatus 10) first aid room 11) 
lifeguard room 12) indoor swimming pool 

Fig. 4. Prototype design of swimming pool Hallenbad Espelkamp/D, architect: 
A. Wagner (1963), published in [10], pp. 132-133 

After specifying consistent functional requirements, the architect creates 
a prototype design/floor layout of the building. He/she operates on high-level 
geometrical concepts like zones and rooms, and maps them to area nodes and room 
nodes from area/room graphs (cf. Fig. 4 and [25]). Due to this mapping it is 
possible to check whether the design meets functional requirements specified in 
the graph. Based on geometrical elements of the prototype (like walls, doors, 
and windows) the relations between conceptual elements are computed by the 
checker and compared with the structure of the graph of functional requirements. 
An example of such a check is verifying whether room nodes connected by an 
accessibility edge in the room graph correspond in the prototype design/floor 
layout to rooms that are adjacent and accessible via appropriately placed doors. 
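
The accessibility check just described can be sketched as a comparison of two edge sets. The Python fragment below is a minimal illustration, assuming that the list of doors (each connecting two rooms) has already been extracted from the geometry of the floor layout; the function name and the sample data are hypothetical.

```python
def missing_connections(accessibility_edges, doors):
    """Return required accessibility edges that no door in the layout realises.

    accessibility_edges: iterable of (room_a, room_b) taken from the room graph.
    doors: iterable of (room_a, room_b) pairs, one per door in the floor layout.
    A door is taken to connect its two rooms in both directions.
    """
    realised = {frozenset(d) for d in doors}
    return [e for e in accessibility_edges if frozenset(e) not in realised]


required = [("EntranceHall", "TicketDesk"), ("Showers", "IndoorSwimmingPool")]
doors = [("TicketDesk", "EntranceHall")]
violations = missing_connections(required, doors)
```

Here the layout lacks a door between the showers and the pool, so that pair is reported, while the entrance hall and the ticket desk are connected regardless of the order in which the door lists its two rooms.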

The next issue worth checking is the agreement between the layout proposed 
and the architectural norms. In particular we have to check whether: 

— each area, room, and other design element has adequate spacing 

— the location of each room with regard to geographical considerations is cor- 
rect (for example whether the swimming pool is placed with its longer wall 
towards the South) 

— windows provide rooms with enough light (the minimal window area for 
rooms intended for prolonged occupancy is equal to 1/8 of the floor area) 

— the layout is optimal for the supply of essential services (water, gas, electric- 
ity) and ventilation, etc. 
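
As a small worked example of such a norm check, the window rule quoted above (window area of at least 1/8 of the floor area) can be written as a one-line predicate. The function name and the sample room are hypothetical; a real checker would first have to derive both areas from the geometry of the prototype.

```python
MIN_WINDOW_RATIO = 1 / 8   # minimal window area relative to floor area


def window_area_ok(floor_area_m2, window_areas_m2):
    """True if the summed window area reaches 1/8 of the floor area."""
    return sum(window_areas_m2) >= MIN_WINDOW_RATIO * floor_area_m2


# A 24 m2 room needs at least 3 m2 of windows.
enough = window_area_ok(24.0, [1.5, 1.5])
too_little = window_area_ok(24.0, [2.0])
```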

Fig. 5 shows standards for swimming pool design (cf. [10], [19], [26]) that 
could be used in architectural norm checkers. 

Elements                                Indicators                              Quantities for 25 x 12,5 m swimming pool 
Pool water area                         -                                       312,5 m² 
Surface of pool surrounds               60 - 70 % of pool water area            190 - 230 m² 
Number of places for storing clothes    1 place per 1,5 m² of pool water area   210 places 
Number of places for changing clothes   60% of places for storing clothes       120 places 
Max. number of users at the same time,  1 person per 1,5 m² pool water area     210 people (70 people in the water, 
including:                                                                      140 in pool surrounds, changing room, 
                                                                                cafe, etc.) 
  men                                   55%                                     115 men 
  women                                 45%                                     95 women 
Number of showers                       1 shower per 5-7 users                  24-17 
Number of toilets for: 
  men                                   1 toilet per 50 men                     2 toilets 
                                        1 urinal per 30 men                     4 urinals 
  women                                 1 toilet per 25 women                   4 toilets 

Min. swimming hall length - length of the swimming pool + 2 x 3 m (3 m on both sides of the pool) 
Min. swimming hall width - width of the swimming pool + 2 x 2 m (2 m on both sides of the pool) 
Most common lengths for sports swimming pools: 51 m, 50 m, 25 m, 33 1/3 m 
Most common widths for sports swimming pools: 12,5 m, 15 m 
Min. width of swimming lane: 2 m 
Most common dimensions for learner pools: 10,0 x 4,5 m, 12,5 x 6,0 m, 12,5 x 8,0 m, 16 2/3 x 8 1/3 m 

Fig. 5. Standards for swimming pools 

After creating the prototype design and checking its consistency with 
functional requirements and architectural norms, the prototype is transformed to 
the standard CAD environment (ArchiCAD, Architectural Desktop), which is 
more suitable for the detailed, technical design. In this environment the architect 
continues designing but at a much more detailed level than in the conceptual 
phase. 
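
To show how the indicators of Fig. 5 could feed a norm checker, the following Python sketch evaluates some of them for the 25 x 12,5 m pool. The function and key names are hypothetical. The raw results differ slightly from the quantities printed in the figure (for instance about 208 rather than 210 places), which suggests that the figure lists rounded values; that reading is an assumption and is not stated in the source.

```python
def pool_indicators(length_m, width_m):
    """Raw (unrounded) values of some Fig. 5 indicators for a given pool."""
    water = length_m * width_m      # pool water area in m2
    places = water / 1.5            # 1 place (or person) per 1,5 m2 of water
    return {
        "water_area_m2": water,
        "surrounds_m2": (water * 60 / 100, water * 70 / 100),   # 60-70 %
        "storing_places": places,
        "changing_places": places * 60 / 100,
        "men": places * 55 / 100,
        "women": places * 45 / 100,
        "min_hall_length_m": length_m + 2 * 3,   # 3 m on both sides
    }


ind = pool_indicators(25, 12.5)
```

For this pool the water area is 312.5 m2, the surrounds come out between 187.5 and 218.75 m2, and the minimum hall length is 31 m.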

3 Swimming Pool Checkers — PROGRES Specification 

For prototyping (constraint) checkers for the above mentioned consistency rules 
we now use (in contrast to previously published papers about the related activ- 
ities) the recently introduced mechanism of local constraints and repair actions 
available in the graph rewrite system PROGRES. A local constraint declaration 
introduces an integrity constraint, which is an arbitrary first-order predicate 
logic formula and which is (incrementally) checked at run-time. The constraint- 
defining Boolean expression uses path expressions to navigate from the consid- 
ered node self to related nodes in its neighbourhood. The constraints are, there- 
fore, a kind of derived Boolean attributes, which have to be true at the end of 
so-called safe transactions or productions. (A transaction is a parameterised sub- 
program, which calls an arbitrary number of tests and productions to implement 
the desired graph transformation.) The keyword safe marks those productions 
and transactions which take a consistent graph as input and produce a consistent 
graph as output with respect to all relevant integrity constraints. All produc- 
tions without this prefix may produce intermediate graph states which violate 
some integrity constraints. The violation of a constraint at the end of safe graph 
transformations causes immediate termination of a running execution process or 
the activation of a so-called repair action. The repair action is a graph 
transformation which should be successful and eliminate the detected inconsistency; 
otherwise, the PROGRES execution machinery stops and issues a final error 
message. 
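
The interplay of safe transformations, constraints and repair actions can be mimicked outside PROGRES. The Python sketch below is only an analogy under simplifying assumptions: all names are hypothetical, the graph is a plain dictionary, and constraints are re-evaluated globally after the transformation, whereas PROGRES checks node-local constraints incrementally.

```python
class ConstraintViolation(Exception):
    """Raised when a constraint still fails after its repair action."""


class Constraint:
    def __init__(self, name, check, repair=None):
        self.name, self.check, self.repair = name, check, repair


def run_safe(graph, transformation, constraints):
    """Apply `transformation`, then enforce all constraints on the result."""
    transformation(graph)            # intermediate states may be inconsistent
    for c in constraints:
        if c.check(graph):
            continue
        if c.repair is not None:
            c.repair(graph)          # the repair action tries to fix the graph
        if not c.check(graph):       # no repair action, or it did not help
            raise ConstraintViolation(c.name)
    return graph


# Example: a Start activity must not have incoming next edges.
graph = {"next": {("Start", "Undressing")}}
no_incoming_start = Constraint(
    "checkIncomingEdges",
    check=lambda g: all(target != "Start" for (_, target) in g["next"]),
    repair=lambda g: g.update(next={e for e in g["next"] if e[1] != "Start"}),
)
run_safe(graph, lambda g: g["next"].add(("Swimming", "Start")),
         [no_incoming_start])
```

The transformation adds an inconsistent edge, the repair action removes it again, and the call returns normally; without a repair action the same violation would raise `ConstraintViolation`.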

The following part of this section shows an example of a consistency checker 
specified with local constraints and repair actions of PROGRES. This checker 
verifies whether the order of activities in an activity graph is appropriate and 
whether every activity has an adequate room assigned. The most important 
class in our example is an abstract class Activity that represents the activity 
performed in the designed object. 

node class Activity is a Object 
  meta 
    followingActivities : type in Activity [0:n] := Activity; 
    activityRooms : type in Room [0:n] := Room; 
  constraint 
    checkFollowingActivities 
      = for all activity : Activity := self.-next-> :: 
          activity.type in self.followingActivities 
        end 
      else 
        for all act : Activity := self.-next-> 
        do 
          choose 
            when (not (act.type in self.followingActivities)) 
            then 
              removeNextEdge ( self, act ) 
            else 
              skip 
          end 
        end 
      end; 
    checkActivityRooms 
      = for all room : Room := self.-activityRoom-> :: 
          room.type in self.activityRooms 
        end; 
end; 

Activity is the base class for the Start and Stop classes, which exist in the 
specifications for all kinds of buildings: 

node type Start : Activity 
  constraint 
    checkIncomingEdges = empty ( self.<-next- ); 
  redef meta 
    activityRooms := nil; 
end; 

node type Stop : Activity 
  redef meta 
    followingActivities := nil; 
    activityRooms := nil; 
end; 

It is also the base class for the building (swimming pool) specific activities like 
Dressing and Undressing (for the sake of clarity we show only these two building 
specific classes): 

node type Dressing : Activity 
  redef meta 
    followingActivities := HairDrying; 
    activityRooms := 
      LifeGuardRoom or ChangingCabins or CommonChangingRooms; 
end; 

node type Undressing : Activity 
  redef meta 
    followingActivities := StoringClothes; 
    activityRooms := 
      LifeGuardRoom or ChangingCabins or CommonChangingRooms; 
end; 

Activity contains two meta attributes: followingActivities and activityRooms. (A 
meta attribute is not a node attribute, but a node type attribute. Its value may 
not be changed at run-time.) The first meta attribute is a set of node types 
derived from the type Activity, indicating types of activities that could be linked 
by next edges with the activity of the considered type. The next relation is 
defined between two Activity classes and indicates the order of activities 
performed in the designed building. The followingActivities attribute is set in the 
Activity class to the Activity type and could be redefined in Activity subclasses 
(cf. Dressing, Undressing, Stop classes). The second meta attribute is a set of 
node types derived from the type Room, indicating types of rooms that could 
be linked by activityRoom edges with the activity of the considered type. The 
activityRoom relation is defined between Activity and Room classes. If an 
activity a is linked by an activityRoom edge with a room r, it means that the 
activity a is performed in the room r. The activityRooms attribute is set in the 
Activity class to the Room type and could be redefined in Activity subclasses 
(cf. Dressing, Undressing, Start, Stop classes). These meta attributes are used 
in the checkFollowingActivities and checkActivityRooms constraints. The constraint 
checkFollowingActivities verifies whether every activity linked by the next edge 
with the considered activity is of a type included in the followingActivities set. 
The constraint checkActivityRooms verifies whether every room linked by 
activityRoom to the considered activity is of a type included in its own activityRooms 
set. In addition to these two constraints, checkIncomingEdges is defined for the 
Start class. It checks whether the set of incoming next edges is empty for a given 
Start activity. The constraint checkFollowingActivities contains a user-defined 
repair action. This action finds all inconsistent next edges and removes them with 
the removeNextEdge transaction. 
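
For readers unfamiliar with PROGRES, the role of the meta attribute followingActivities and of the repair action can be approximated with class-level attributes in Python. This is a loose analogy, not a translation: the method names are hypothetical, `isinstance` also accepts subclasses of the listed types, and the repair simply filters a list instead of calling a removeNextEdge transaction.

```python
class Activity:
    # Class-level ("meta") attribute: the allowed successor types.
    # It is filled in below because it refers to Activity itself.
    following_activities = ()

    def __init__(self):
        self.next = []   # targets of outgoing next edges

    def check_following_activities(self):
        return all(isinstance(a, self.following_activities) for a in self.next)

    def repair_following_activities(self):
        # Drop every next edge whose target has a type that is not allowed.
        self.next = [a for a in self.next
                     if isinstance(a, self.following_activities)]


Activity.following_activities = (Activity,)   # default: any activity may follow


class StoringClothes(Activity):
    pass


class Swimming(Activity):
    pass


class Undressing(Activity):
    following_activities = (StoringClothes,)   # redefinition, cf. "redef meta"


class Stop(Activity):
    following_activities = ()                  # nil: nothing may follow Stop


u = Undressing()
u.next = [StoringClothes(), Swimming()]
consistent_before = u.check_following_activities()
u.repair_following_activities()
consistent_after = u.check_following_activities()
```

Because the allowed successors live on the class rather than on the instance, every Undressing node shares the same rule, which mirrors the statement that a meta attribute cannot be changed at run-time for individual nodes.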

The mechanism of constraints and repair actions seems to be very useful for 
the specification of checkers. The main disadvantage of the current PROGRES 
implementation is that it either deactivates constraint checking completely or 
enforces all constraints, but offers no option: 

1. to activate only selected constraints 

2. to display error messages about constraint violation 

3. to activate only selected repair actions 

The following part of this section shows ad-hoc solutions to these problems. 
For simplicity we restrict our considerations only to the checkFollowingActivities 
constraint. The Activity class has been modified as follows: 

node class Activity is a Object 
  intrinsic 
    followingActivitiesCheckActivated : boolean := true; 
    followingActivitiesRepairActionActivated : boolean := true; 
    errorMsg : string := ""; 
  meta 
    followingActivities : type in Activity [0:n] := Activity; 
    activityRooms : type in Room [0:n] := Room; 
  constraint 
    newCheckFollowingActivities 
      = ( not self.followingActivitiesCheckActivated ) 
        or ( oldCheckFollowingActivities ( self ) 
             and ( self.errorMsg = "" ) ) 
        or ( not oldCheckFollowingActivities ( self ) 
             and ( self.errorMsg # "" ) ) 
      else 
        choose 
          when ( self.followingActivitiesRepairActionActivated 
                 and not oldCheckFollowingActivities ( self ) ) 
          then 
            oldRepairAction ( self ) 
            & setErrorMsg ( self, "" ) 
          else 
            setErrorMsg 
              ( self, 
                "Activity contains not recommended outgoing next edge." ) 
        end 
      end; 
end; 

The constraint checkFollowingActivities was replaced in the modified Activity 
class by newCheckFollowingActivities. In the newCheckFollowingActivities 
constraint the function oldCheckFollowingActivities is invoked. 

function oldCheckFollowingActivities : 
  ( activity : Activity ) -> boolean = 
    for all nextActivity : Activity := activity.-next-> :: 
      nextActivity.type in activity.followingActivities 
    end 
end; 

This function returns true if every activity linked by the next edge with the 
activity passed as a parameter has a type included in the activity.followingActivities 
set. In other words, oldCheckFollowingActivities does the same as 
checkFollowingActivities, but is defined as a function. In the repair action of the 
newCheckFollowingActivities constraint, oldRepairAction is invoked. 

transaction oldRepairAction ( activity : Activity ) [1:1] = 
  for all nextActivity : Activity := activity.-next-> 
  do 
    choose 
      when 
        ( not (nextActivity.type in activity.followingActivities) ) 
      then 
        removeNextEdge ( activity, nextActivity ) 
      else 
        skip 
    end 
  end 
end; 

The oldRepairAction PROGRES transaction has the same functionality as the 
repair action for the checkFollowingActivities constraint. In the new Activity 
class, three intrinsic attributes have been added: 



— the followingActivitiesCheckActivated Boolean attribute is responsible for 
activation/deactivation of following activities checking for single node 
instances. If the value of this attribute is true, checking is activated. 

— the followingActivitiesRepairActionActivated Boolean attribute is responsible 
for activation/deactivation of a repair action for a considered node 
instance. If the value of this attribute is true, inconsistencies are eliminated by 
the execution of oldRepairAction, otherwise inconsistencies are marked only 
by assigning error messages to the errorMsg attribute. 

— the errorMsg string attribute is used for marking inconsistencies with error 
message texts. 

In the newCheckFollowingActivities constraint, first the Boolean condition (not 
self.followingActivitiesCheckActivated) is evaluated. If this condition is true, 
then the evaluation of the considered constraint is finished; otherwise the 
remaining part ((oldCheckFollowingActivities (self) and (self.errorMsg = "")) or 
(not oldCheckFollowingActivities (self) and (self.errorMsg # ""))) is evaluated. 
This remaining part means that one of the conditions below has to be fulfilled: 

— the function oldCheckFollowingActivities returns true for the considered 
self activity and errorMsg for the self activity equals the empty string. 

— the function oldCheckFollowingActivities returns false for the considered 
self activity and the attribute errorMsg for the self activity is a 
non-empty string. 

If neither condition holds, the repair action is invoked. In the 
repair action a guarded choose statement checks the condition 
(self.followingActivitiesRepairActionActivated and not 
oldCheckFollowingActivities (self)), i.e. it is checked whether the repair action is 
activated and oldCheckFollowingActivities returns false for the considered activity. If 
the condition is fulfilled, the inconsistencies are eliminated with the oldRepairAction 
transaction and errorMsg is set to the empty string; otherwise the errorMsg 
attribute is set to the message that describes the error. 
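
The control flow of the extended constraint is easier to follow when written out procedurally. The Python sketch below mirrors the logic described above under simplifying assumptions: the class `CheckedActivity` and its methods are hypothetical, successor types are plain strings, and `enforce` stands in for the moment at which PROGRES evaluates the constraint and, on violation, runs the repair action.

```python
class CheckedActivity:
    def __init__(self, allowed, successors):
        self.allowed = set(allowed)          # stands in for followingActivities
        self.successors = list(successors)   # types reached via next edges
        self.check_activated = True
        self.repair_activated = True
        self.error_msg = ""

    def old_check(self):
        return all(s in self.allowed for s in self.successors)

    def old_repair(self):
        self.successors = [s for s in self.successors if s in self.allowed]

    def new_check(self):
        if not self.check_activated:
            return True
        ok = self.old_check()
        return (ok and self.error_msg == "") or (not ok and self.error_msg != "")

    def repair(self):
        if self.repair_activated and not self.old_check():
            self.old_repair()
            self.error_msg = ""
        else:
            self.error_msg = "Activity contains not recommended outgoing next edge."

    def enforce(self):
        """Evaluate the constraint; on violation run the repair action once."""
        if not self.new_check():
            self.repair()
        return self.new_check()


# Repair switched off: the inconsistency is only marked with a message.
marked = CheckedActivity({"StoringClothes"}, ["StoringClothes", "Swimming"])
marked.repair_activated = False
marked_ok = marked.enforce()

# Repair switched on: the offending successor is removed.
repaired = CheckedActivity({"StoringClothes"}, ["StoringClothes", "Swimming"])
repaired_ok = repaired.enforce()
```

In both cases the constraint holds afterwards, but for different reasons: the first node keeps its inconsistent edge and carries an error message, the second node is consistent and has an empty message.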

The specification above solves problems 1-3. However, very low-level, 
hard-to-understand PROGRES code has to be written. A general PROGRES 
mechanism: 

— allowing activation/deactivation of selected constraints 

— marking inconsistencies by means of appropriate error messages 

— allowing activation/deactivation of repair actions for selected constraints 

for all nodes or specific nodes of a given node class/type would simplify writing 
PROGRES specifications for checkers considerably. Future versions of PROGRES 
or other graph transformation environments with similar features should 
be extended appropriately or refrain totally from giving any language support 
for constraint checking purposes. 

4 Summary 

The research concerning conceptual design, in particular the requirements elic- 
itation phase and the phase of creating the prototype design for the specified 
requirements, seems to be worth continuing. To the best of our knowledge, the 
CAD tools for architects developed until now ([1], [2]) do not support the elicita- 
tion phase. We are also not aware of the usage of UML activity diagrams for this 
purpose. Based on this observation, the method supporting conceptual design, 
previously presented in [23], [24] and [25], has been supplemented with a new 
requirements elicitation phase. UML activity diagrams are now used to capture 
the behaviour of prototypical users of the designed object in the form of use 
cases. In the method, use cases and scenarios as well as functional requirements 
and constraints for the considered object are specified using graph structures. 
The consistency of the specified requirements is verified by graph checkers. In 
other works concerning the use of the graph rewriting system PROGRES in the 
area of conceptual design of buildings ([17], [18]), the elicitation phase is skipped 
and the consistency of a designed building is verified based on the parametrizable 
graph knowledge specified by a knowledge engineer. In our case it is checked whether 
the object to be designed fulfils the graph requirements specified by the designer in 
the elicitation phase. The combination of those two complementary approaches 
will be considered in the future. 

Graph transformation systems like PROGRES appear to be appropriate tools 
for prototyping the software implementing such a method. In our graph speci- 
fication we used the new built-in graph constraint checking mechanisms of the 
PROGRES language for that purpose in order to considerably simplify the con- 
struction of these specifications. But unfortunately, experience so far shows that 
these new constructs are still difficult to use and should be improved in various 
ways. In spite of this fact, PROGRES seems to be useful in communication be- 
tween an architect (a domain expert) and a knowledge engineer, while defining 
and implementing architectural domain knowledge. The specification presented 
in Section 3 was created in co-operation with architects, and most of the 
PROGRES statements used in this specification were clear to them. Moreover, 
the architects made an interesting remark that the formalism of hierarchical 
graphs and transformations on them would be useful, because the structure of 
all, or almost all, buildings is hierarchical. 

The PROGRES specification for a swimming pool is still in the process of 
construction. Based on this specification we are going to create a prototype ap- 
plication integrated with an existing CAD tool for architects and verify whether 
the presented method is useful in practice. In our work on this prototype we 
are going to use the commercial tool for architects ArchiCAD 8.0 [1]. An add- 
on extending the functionality of ArchiCAD will be implemented with the use 
of ArchiCAD API Development Kit 4.3 and Microsoft Visual C++ 6.0. It is 
planned to integrate this add-on with a graph editor for specifying the functional 
requirements of the building to be designed. The graph editor will be built 
on the basis of the PROGRES UPGRADE Framework [7]. Another interesting 
task to consider would be a transformation from an ArchiCAD representation 
into a graph representation; however, such a reverse engineering step seems to be 
quite difficult (or even impossible) because of the lack of semantic information 
about the rooms’ purpose in the standard ArchiCAD model. 

In some of the design tasks, where engineering rules are strict, it is possible to 
find a solution to the design problem by means of graph generation strategies. 
An example of such a strategy would be the generation of a furniture layout in an 
office, based on the function of a considered room. The creation of such a strategy 
in co-operation with architects is planned in the future (cf. [14]). 

Finally, we would like to emphasize that the presented method seems to 
be worth considering in other engineering domains like Machine Building or 
Electrical Engineering as well. 

Abstract. In this paper we discuss how tools for conceptual design in 
civil engineering can be developed using graph transformation specifi- 
cations. These tools consist of three parts: (a) for elaborating specific 
conceptual knowledge (knowledge engineer), (b) for working out concep- 
tual design results (architect), and (c) automatic consistency analyses 
which guarantee that design results are consistent with the underlying 
specific conceptual knowledge. For the realization of such tools we use 
a machinery based on graph transformations. 

In a traditional PROGRES tool specification the conceptual knowledge 
for a class of buildings is hard-wired within the specification. This is 
not appropriate for the experimentation platform approach we present 
in this paper, as objects and relations for conceptual knowledge are subject 
to many changes, implied by evaluation of their use and corresponding 
improvements. 

Therefore, we introduce a parametric specification method with the fol- 
lowing characteristics: (1) The underlying specific knowledge for a class 
of buildings is not fixed. Instead, it is built up as a data base by using 
the knowledge tools. (2) The specification for the architect tools also 
does not incorporate specific conceptual knowledge. (3) An incremental 
checker determines whether a design result is consistent with the current 
state of the underlying conceptual knowledge (data base). 



1 Introduction 

In our group, various tools for supporting development processes have been built 
in the past, for software engineering [14], mechanical engineering, chemical en- 
gineering, process control [10], telecommunication systems [13], and authoring 
support [7]; some of them are presented at this workshop. This paper reports 
on a rather new application domain, namely civil engineering. 

For all tools mentioned above, we use a graph-based tool construction proce- 
dure: internal data structures of tools are modeled as graphs, changes due to com- 
mand invocations are specified by graph rewriting systems. Then, there are two 
different branches for constructing tools, a research-oriented and an 
industry-oriented one. In this paper we restrict ourselves to the research-oriented branch. 

* Work supported by Deutsche Forschungsgemeinschaft (NA 134/9-1) 



J.L. Pfaltz, M. Nagl, and B. Bohlen (Eds.): AGTIVE 2003, LNCS 3062, pp. 90—105, 2004. 
(c) Springer- Verlag Berlin Heidelberg 2004 



Parameterized Specification of Conceptual Design Tools 






[Figure not reproduced; surviving label: semantics of the domain]

Fig. 1. Different areas of conceptual design



There, we derive tools automatically from specifications, using the PROGRES
system [16] for specification development, a code generator for producing code
out of the specification, and the UPGRADE visual framework environment [2]
into which the code is embedded. The resulting tools are efficient demonstrators
for proof-of-concept purposes. These tools are based on our academic
development infrastructure, which has been developed over the last 15 years.

Conceptual Design in civil engineering means that design results are
elaborated on a coarse and abstract level without regarding details which are
later included in constructive design (in other disciplines called detail
engineering) [12]. The main goal of conceptual design is to take the various
levels of semantics for a design problem into consideration (cf. Fig. 1):
(a) domain specific knowledge, such as standards, economy rules, security
constraints, or common and accepted design rules, (b) experience knowledge in
the form of best practice or of using previous design results and, finally,
(c) specific user behavior knowledge or wishes, where users are customers or
architects, respectively.

The essentials of our conceptual design approach are that (i) explicit
knowledge can be formulated, enhanced, or used, (ii) changes are specifically
supported, where changes can happen on the level of knowledge as well as for
design results, (iii) many consistency checks are included in order to report
errors as soon as possible, and (iv) a smooth connection to constructive design
is aimed at. The approach pays off especially if (v) specific classes of
buildings are regarded and, within a class, different designs for buildings and
different variants thereof.

We realize a graph-based demonstrator by which a senior architect
(knowledge engineer) can specify knowledge by tools. The knowledge is specific for
a class of buildings. For the usual architect, there are further tools for developing
conceptual designs. These designs are immediately checked against the
underlying specific knowledge. For the realization of these tools, we use the
machinery already sketched above. We call this demonstrator the conceptual
knowledge experimentation platform, as it allows experimenting with concepts
without being forced to change the realization of tools.

In this paper we take a certain class of buildings as an example, namely
one-floor medium-size office buildings. The example is simplified with respect
to breadth and depth. The paper also gives no details of the implementation
of tools; only the graph transformation specifications are presented here. Tool
functionalities and the user interface style of the experimentation platform are
given in a separate demo description [1 ].

The paper is organized as follows: In section 2 we give a specification of
architect tools in a traditional form, where the building type specific knowledge
is fixed within the specification. This motivates the different specification
method presented in this paper. In section 3 we discuss the specification for the
knowledge engineer tools. Furthermore, we give an example of a host graph which
can be produced by interactively using these tools. This graph, called domain
model graph, describes the characteristics of a class of buildings (here office
buildings). Section 4 gives a specification of the architect's tools by which
conceptual designs for buildings can be elaborated. In section 5 we discuss the
specification for analyses. Section 6 emphasizes the difference between the two
specification methods, the traditional and the parameterized one, summarizes the
main ideas of this paper, and discusses related literature.



2 A Traditional Tool Specification 

There are many projects in the group using graph technology. The specific knowl- 
edge of the appropriate domain usually is hard-wired in the schema and the 
transaction part of a PROGRES tool specification. In this paper, we apply a dif- 
ferent specification method. 

The reason is that the knowledge engineer cannot be expected to learn the
PROGRES language and to use the realization machinery, which is adequate for
a tool builder. Furthermore, the knowledge should be easily modifiable, as we are
experimenting to find suitable object and relation types, restrictions, and rules
for conceptual design in civil engineering.

To illustrate the difference between a traditional specification and the param- 
eterized one described here, we briefly introduce an example specification which 
shows how tools for architectural design of an office building would be described 
in the traditional way. 

The schema part of our example is shown in Fig. 2. It shows the abstract 
node class ROOM with a comment attribute. Nodes of that class can be related 
to each other by Access and Contains edges. The node class is specialized into 
five different node types representing different room types we want to model. 
Therefore, the relations can connect rooms of all specific types. The node class 
ROOM evidently expresses the similarities of different room node types.

Fig. 2. Schema of an office building



production Create_EntranceHall( out newHall : Entrance_Hall [1:1])

    [graphical left-hand and right-hand sides not reproduced]

    return newHall := 3’;
end;

Fig. 3. Example of a graph production, inserting a node and an edge 



The transaction part determines how different graphs of that graph class are 
built. Fig. 3 shows a sample production for our example. The graph pattern to 
be searched requires an Outside node to exist and no Entrance_Hall node to be 
already present (negative application condition). If this pattern is found when 
applying the production, a new Entrance_Hall node is created and connected 
with the outside node by an Access edge. So, the application of the production 
guarantees that the Entrance_Hall is always directly accessible from outside. 
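
The effect of this production can be paraphrased outside PROGRES. The following Python fragment is only a minimal sketch under stated assumptions: the class Graph, its fields, and the function create_entrance_hall are hypothetical names introduced for illustration, and the direction of the Access edge (from the Outside node to the new hall) is assumed, since the graphical part of Fig. 3 is not available here.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """Tiny host graph: typed nodes plus labelled, directed edges (assumed encoding)."""
    node_types: dict = field(default_factory=dict)  # node id -> node type name
    edges: set = field(default_factory=set)         # (source id, label, target id)

    def add_node(self, node_type):
        node_id = len(self.node_types) + 1
        self.node_types[node_id] = node_type
        return node_id

    def nodes_of_type(self, node_type):
        return [n for n, t in self.node_types.items() if t == node_type]


def create_entrance_hall(graph):
    """Apply the rule once; return the new node id, or None if it does not match."""
    outside = graph.nodes_of_type("Outside")
    if not outside:                           # positive pattern: an Outside node exists
        return None
    if graph.nodes_of_type("Entrance_Hall"):  # negative application condition
        return None
    hall = graph.add_node("Entrance_Hall")
    # Edge direction (Outside -> hall) is an assumption of this sketch.
    graph.edges.add((outside[0], "Access", hall))
    return hall
```

A second call on the same graph returns None, which mirrors the negative application condition: at most one Entrance_Hall node can be created this way.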

Thus, each graph of our example specification models the structure of an 
office building floor plan. A ROOM node without an Access relation stands for 
an inaccessible room. The test shown in Fig. 4 finds such rooms. It searches the 
graph for ROOM nodes which are not connected with the outside node by a path 
containing Access or Contains relations. The result is a possibly empty node 
set. In this way we formally define the meaning of inaccessibility. 
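
The meaning of such a test can be illustrated by a plain reachability computation. The sketch below is not the PROGRES path expression itself; it assumes that edges are given as (source, label, target) triples and that Access and Contains edges may be traversed in both directions. The function name inaccessible_rooms is hypothetical.

```python
from collections import deque


def inaccessible_rooms(room_nodes, outside, edges):
    """Return the rooms that no Access/Contains path connects to `outside`.

    edges: iterable of (source, label, target) triples. Edges are followed
    in both directions, which is an assumption of this sketch.
    """
    neighbours = {}
    for source, label, target in edges:
        if label in ("Access", "Contains"):
            neighbours.setdefault(source, set()).add(target)
            neighbours.setdefault(target, set()).add(source)

    reached = {outside}
    queue = deque([outside])
    while queue:                      # breadth-first search from the Outside node
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in reached:
                reached.add(nxt)
                queue.append(nxt)

    # The result may be empty, like the node set returned by the test.
    return {room for room in room_nodes if room not in reached}
```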

test InaccessibleRooms ( out inaccessibleRooms : ROOM [0:n]) =

    ‘1 : Outside
    ‘2 : ROOM

    ( -Access-> or -Contains-> )

    return inaccessibleRooms := ‘2;
end;

Fig. 4. Example of a graph test, finding inaccessible rooms

We see that the knowledge about the building type “office building” is fixed
within the specification. There are room types, such as Entrance_Hall or
2PersonOffice, defined as node types. Evidently, it is not possible to create new
room types, like a coffee kitchen with certain accessibility constraints, without
changing the PROGRES specification (schema, transactions, tests). Using PROGRES
this way means that the knowledge engineer and the specifier (and later on the
visual tool builder) are the same person.

Our aim is to keep these jobs separate. The specifier develops a general
specification, which does not contain specific application knowledge. This
knowledge is put in and modified by the knowledge engineer, who uses tools
derived from the general specification. It is stored as a database, here called
domain model graph. In the same way, there is an unspecific specification for the
architect tools, from which general tools are derived. The architect tools now
use the domain model graph interactively elaborated by the knowledge
engineer. Thereby, the design results are incrementally checked against the
underlying specific knowledge.

Obviously, this approach has far-reaching implications for how a specification
is written, where the domain knowledge is to be found, and where it is used. The
two approaches, fixing domain knowledge in the specification or elaborating
it in a host graph, are not PROGRES-specific; they are different ways to specify.

3 Specification of the Knowledge Engineer Tools 

In this section we describe the specification for the knowledge engineer tools.
Using the corresponding tools, specific domain knowledge for a class of buildings
is explicitly worked out (domain model graph). This knowledge is used to restrict
the architect tools to be explained in the next section.

The upper box of Fig. 5 depicts the PROGRES schema part of the knowledge
engineer specification. This schema is still hard-wired. However, it contains only
general definitions about conceptual design in civil engineering. Therefore,
it is not specific for a certain type of building.

Fig. 5. Schema of knowledge engineering tool and specific model

The node class m_Element (m_ stands for model) serves as root of the class
hierarchy; three node classes inherit from it. The class m_AreaType describes
“areas” in conceptual design. Usually, an area is a room. It may, however, be
a part of a room (a big office may be composed of personal office areas) or
a grouping of rooms (a chief officer area may contain a secretariat, a personal
office for the chief officer, and a meeting room).

From the class m_AreaType two node types are derived: m_AtomicAreaType
represents an area not further decomposed, m_ComplexAreaType a complex area
composed of several areas. In the same way, the classes m_Obligatory and
m_Forbidden describe obligatory and forbidden relations. As we model knowledge
about a building type, optional relations are not modeled explicitly. Everything
that is not forbidden or obligatory is implicitly optional. The reader may note that
attributed relations are represented in PROGRES as nodes with adjacent edges.
Finally, attributes may appear as constituents of areas and relations. So, we have
again node classes and types to represent the attributes, here only for integer
and Boolean values. Note again that attributes have to be defined as nodes, as
they are defined by the knowledge engineer.

Fig. 5 in the lower box shows some nodes which stand for kinds of concepts to
be used for our office building example. We call these kinds models. These
models appear in the host graph, which the knowledge engineer produces
interactively by using the tools whose specification we consider here. The
2PersonOfficeModel node represents a corresponding room kind, the AccessRelation
node an accessibility relation between rooms, and the ElectricityAttribute an
attribute node needed to describe a property of an office room. These nodes are
schematic information (information on type level) for the class of buildings to
be described. As, however, this type information is not static but interactively
elaborated, we call it model.

Fig. 6. Cutout of a Domain Model Graph (specific for a type of buildings)

An attribute, such as for electric access, may appear in several room
models. So, it has to be defined once before being used several times. In the same
way, accessibility occurs between different room models. Furthermore, it is up to
the knowledge engineer to decide which attribute and relation concepts to
introduce. Moreover, these definitions may be useful for several types of buildings.
Therefore, there is a basic model definition layer in the middle of Fig. 5.

Summing up, Fig. 5 introduces a three-level approach for introducing knowledge.
The PROGRES schema types are statically defined. They represent hard-wired
foundational concepts of civil engineering. The other levels depend on the
knowledge tool user. Thereby, the middle layer defines basics to be used in the
specific knowledge which is dependent on the type of building. So, the static layer
on top defines invariant or multiply usable knowledge, whereas the middle layer and
the bottom layer are specific for a type of building. The host graph built up by
knowledge engineer tools contains information belonging to the middle and the
bottom layer.

Fig. 6 shows a cutout of the graph structure the knowledge engineer develops,
which we call domain model graph. On the left side, basic attribute and relation
models are depicted. They belong to level 2 of Fig. 5. On the right side their
use in a specific domain model is shown. This right side shows the area models
2PersonOfficeModel and CorridorModel. The 2PersonOfficeModel has two
attributes to demand network sockets and electricity to be available. Between the
two area models, an access relation is established, to demand an access from all
2 person offices to the corridor, in the graph realized through an edge-node-edge
construct.

production m_CreateBoolAreaAttribute( attributeModelDescr : string ;
                                      attributeValueDescr : boolean;
                                      areaModel : m_AtomicAreaType)
    [0:1] =

    ‘1 = areaModel
    ‘2 : m_boolAreaAttributeType

    condition ‘2.attributeDescr = attributeModelDescr;
    transfer 3’.attributeDescr := attributeModelDescr;
             3’.attributeValueDefinition := attributeValueDescr;
end;

Fig. 7. Creating an instance of an attribute model

In Fig. 5 we have introduced a three-level “type” system. Any lower level
is an instance of an upper level. A node, however, can be an instance of the
static type and of a dynamically introduced basic type as well. We can see that
the electricity attribute is an instance of the static type m_boolAreaAttribute
and of the dynamic basic type ElectricityAttributeModel. This is realized
by giving the electricity attribute node a string attribute denoting the dynamic
basic type and an edge to_instance from the basic type to the attribute
node. Tests and transactions guarantee the consistency between these static and
dynamic types of instances.
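
The double typing can be illustrated with a small data structure. This is a minimal sketch, assuming that each node records its static type and the name of its dynamic basic type as a string, and that to_instance edges are stored as pairs of node identifiers; the names Node and dynamic_type_consistent are hypothetical and not part of the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    node_id: int
    static_type: str         # fixed PROGRES schema type
    dynamic_type: str = ""   # string attribute naming the basic model


def dynamic_type_consistent(instance, basic_model_ids, to_instance_edges):
    """Check that the string attribute and the to_instance edge agree.

    basic_model_ids: dict mapping a dynamic type name to its basic model node id.
    to_instance_edges: set of (basic model id, instance id) pairs.
    """
    model_id = basic_model_ids.get(instance.dynamic_type)
    if model_id is None:     # the dynamic type was never introduced
        return False
    return (model_id, instance.node_id) in to_instance_edges
```

A check of this kind has to be evaluated at runtime, because the set of basic models is only known once the knowledge engineer has introduced them.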

Fig. 7 shows a production to create an attribute assigned e.g. to a
2PersonOfficeModel. Please note that the model is represented by a node with the
denotation areaModel of the static type m_AtomicAreaType which has a PROGRES
node attribute storing the dynamic type 2PersonOfficeModel. Input
parameters are the attribute model description as a string, an attribute value,
and the model node representing the 2 person room concept. Node ‘2 on the
left side represents an attribute model node. By the condition clause we
ensure that it corresponds to the input parameter attributeModelDescr. Only
if an attribute model (node ‘2) with this description exists, a new attribute
(node 3’) is created and linked to the 2PersonOfficeModel (node 1’) and to
the attribute model (node 2’). The model description is stored in a string
attribute of node 3’, just as the attribute value. The inverse operation, deleting
an attribute, is trivial. Before deleting an attribute model, all corresponding
instances have to be deleted. This is done by a transaction executing several
productions in a specific order.
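
The behavior of this production can be mimicked in a few lines of Python. The sketch is an illustration only: the dictionary encoding of the graph, the direction of the to_attribute and to_instance edges, and the function name create_bool_area_attribute are assumptions, not part of the PROGRES specification.

```python
def create_bool_area_attribute(graph, attribute_model_descr, attribute_value, area_model):
    """Create an attribute instance for area_model; return its id or None.

    graph: dict with "nodes" (id -> attribute dict) and "edges" (set of
    (source, label, target) triples); this encoding is an assumption.
    """
    nodes, edges = graph["nodes"], graph["edges"]
    # Left-hand side: find the attribute model (node 2) with the given description.
    model = next(
        (n for n, attrs in nodes.items()
         if attrs["type"] == "m_boolAreaAttributeType"
         and attrs.get("attributeDescr") == attribute_model_descr),
        None,
    )
    if model is None or area_model not in nodes:
        return None                      # the production is not applicable
    # Right-hand side: create node 3 and link it to nodes 1 and 2.
    new_id = max(nodes) + 1
    nodes[new_id] = {
        "type": "m_boolAreaAttribute",
        "attributeDescr": attribute_model_descr,
        "attributeValueDefinition": attribute_value,
    }
    edges.add((area_model, "to_attribute", new_id))
    edges.add((model, "to_instance", new_id))
    return new_id
```

If no attribute model with the requested description exists, the graph is left unchanged, which corresponds to a failing production.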

Interactive development by the knowledge engineer means that transactions
modifying the domain model graph are now invoked from outside. Then, this
domain model graph is built up containing two levels as shown in Fig. 6. Thereby,
the corresponding to_instance, to_attribute, to_Relation, and from_Relation
edges are inserted. Any new concept is represented by a node that has a static
type (to be handled within the PROGRES system) and a dynamic type, with
bordering nodes for the corresponding attributes, which belong to predefined
attributes of the basic model definition layer.

4 Specification for Architect Tools 

Whereas the domain model graph is used to store conceptual knowledge, the
design graph provides a data structure to represent the conceptual design of
a building. The specification of the designer tools directly uses the
runtime-dependent basic domain knowledge (layer 2 of Fig. 5). So, the consistency
of a design graph with this basic knowledge can be ensured. The consistency of the
design graph with the building type specific knowledge of layer 3 is guaranteed
by other analyses. Both analyses are described in the next section.

The design graph allows specifying the structure and the requirements of
a building in an early design phase, above called conceptual design. Designing
a building without any layout and material aspects allows the architect to
concentrate on the usage of this building on a high abstraction level. During the
constructive design, this design can be matched with an actual floor plan to
discover design errors. This is not further addressed in this paper.

The design graph again is the result of the execution of a PROGRES
specification, where transactions are interactively chosen. The three-level “type”
system, which is similar to that of Fig. 5, is shown in Fig. 8. The essential
difference is that we now model concrete objects and relations, both with
corresponding attributes, and not knowledge describing what such a design
situation has to look like. This is denoted by the prefix d_, which stands for
classes and types for design.

Another difference is the d_Notification node class with three
corresponding node types. The nodes of these types are used to represent warnings,
errors, and tips to be shown to the architect. Furthermore, there are now concrete
relation nodes between design objects and not rules that certain relations have
to exist or may not exist. Finally, the design graph nodes now are instances,
and not nodes describing types for instances as was the case on layer 3 of the
knowledge engineer tools.

Fig. 8. Scheme of the design graph



The instantiation of attributes, areas, and relations works in the same way
as described in Fig. 7 for models. In the design graph we find instances of
concepts with a static and a dynamic type, with bordering instances of attributes
and relations, both being applied occurrences of the corresponding basic models
introduced on layer 2 of Fig. 5. As this basic model layer is again needed on the
design graph level, we just import it from the domain model graph.



5 Consistency Analyses 

In this section we present two different forms of consistency analyses. The first 
form is part of the domain model graph specification. So, these analyses are 
executed when the knowledge engineer tool is running, to keep the dynamic type 
system consistent. Corresponding internal analyses can be found for the design 
graph, respectively. The second form of analyses shows how the consistency 
between the domain model graph and the design graph is checked. 

Let us start with the first form of analyses built into the domain model graph
specification. Fig. 9 shows a test being part of the analyses to guarantee the
consistency of the dynamic type system. Each basic model has to be unique.
So, if the knowledge engineer tries to create a model that already exists, the
enclosing transaction should fail. The test m_AttributeModelExists gets as
input parameter the model description, e.g. ElectricityAttributeModel. If the
model already exists, then a node of type m_boolAreaAttributeType exists
whose attribute attributeDescr has the value of the input parameter.

test m_AttributeModelExists ( modelDescr : string )
    [0:1] =

    ‘1 : m_boolAreaAttributeType

    valid (self.attributeDescr = modelDescr)
end;

Fig. 9. Test if a model already exists

These analysis transactions work as usual in PROGRES specifications. They
guarantee that certain structural properties of a graph class (here domain model
graph) are fulfilled. In the above example this means that a basic model
definition occurs only once. The difference to traditional PROGRES specifications,
however, is that the corresponding node type is dynamic. So we have to check
the values of runtime-dependent attributes.
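
The uniqueness check can be sketched as follows. The fragment assumes that nodes are stored as a dictionary from node identifiers to attribute dictionaries; the function create_attribute_model stands in for a hypothetical enclosing transaction and is not taken from the specification.

```python
def attribute_model_exists(nodes, model_descr):
    """True if some m_boolAreaAttributeType node carries the given description."""
    return any(
        attrs["type"] == "m_boolAreaAttributeType"
        and attrs.get("attributeDescr") == model_descr
        for attrs in nodes.values()
    )


def create_attribute_model(nodes, model_descr):
    """Hypothetical enclosing transaction: fails if the model already exists."""
    if attribute_model_exists(nodes, model_descr):
        raise ValueError(f"basic model {model_descr!r} already exists")
    new_id = max(nodes, default=0) + 1
    nodes[new_id] = {"type": "m_boolAreaAttributeType", "attributeDescr": model_descr}
    return new_id
```

The comparison is made on an attribute value rather than on a node type, which reflects the point made above: the type in question is dynamic and only known at runtime.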

Corresponding internal analyses also exist on the design graph level, for the
consistency between predefined basic knowledge (imported from the domain
model graph) and the current form of the design graph. As they work in
the same way as the internal analyses of the domain model graph, we skip them.

The second form of analyses checks whether there are violations of the
predefined specific knowledge within the design graph. For this, we have to find
inconsistencies between the design graph and the domain model part of the domain
model graph (cf. Fig. 6). The attributes of an area model prescribe the usage of
an area in the design graph. In an office block, there should be network sockets
in all offices, but not in the corridor. This rule is defined in the domain model
graph by the Boolean attribute NetworkAttribute whose value can be true or
false. If the architect constructs a network socket in the corridor, by connecting
the area Corridor with the attribute NetworkAttribute, the design graph is in
an inconsistent state.

Tools immediately report such inconsistencies. However, we allow the archi- 
tect to violate rules and do not stop the design process, because we do not want 
to hinder his creativity. 

Fig. 10 shows an example production, which checks whether the value of an
attribute, defined in the model graph, corresponds to the attribute value in the
design graph. Whereas the nodes ‘1 and ‘2 describe an area model and an
attribute defined in the domain model graph, the nodes ‘3 and ‘4 describe an
area and an attribute defined in the design graph. The first two lines of the
condition clause ensure that only those nodes of the design graph (nodes ‘3 and
‘4) are found which correspond to the area model (node ‘1) and its attribute
(node ‘2). The next two lines of the condition clause demand the attribute to
be false in the domain model graph (node ‘2) and to be true in the design
graph (node ‘4). So, an inconsistency between the domain model graph and
the design graph is found. In this case, on the right side of the production, the
new node 5’ is inserted to mark this inconsistency and to store a specific error
message.

production d_CheckAreaAttribute( AreaModel : m_AtomicAreaType ;
                                 Attribute : m_boolAreaAttribute)

    condition ‘1.areaModelDescr = ‘3.d_areaModelDescr;
              ‘2.attributeModelDescr = ‘4.d_attributeModelDescr;
              ‘2.attributeValueDefinition = false;
              ‘4.d_attributeValueDefinition = true;
    transfer 5’.message := "Wrong Attribute Value";
end;

Fig. 10. Analysis to check the consistency of bool attributes 
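
The logic of this analysis can be restated as a comparison of two tables. The following sketch assumes a flattened encoding of both graphs (a dictionary for the model attributes, a list of dictionaries for the design attributes); the function name check_area_attributes and the shape of the returned notifications are hypothetical.

```python
def check_area_attributes(model_attributes, design_attributes):
    """Return notifications for attributes forbidden by the model but set in the design.

    model_attributes: (areaModelDescr, attributeModelDescr) -> bool, taken
        from the domain model graph.
    design_attributes: list of dicts with the keys d_areaModelDescr,
        d_attributeModelDescr and d_attributeValueDefinition.
    Both encodings are assumptions made for this sketch.
    """
    notifications = []
    for attr in design_attributes:
        key = (attr["d_areaModelDescr"], attr["d_attributeModelDescr"])
        forbidden = model_attributes.get(key) is False
        if forbidden and attr["d_attributeValueDefinition"] is True:
            # Counterpart of inserting node 5'; the design itself is left untouched.
            notifications.append({"message": "Wrong Attribute Value", "attribute": key})
    return notifications
```

The function only collects notifications and never rejects the design, in line with the decision to report inconsistencies without stopping the design process.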



6 Conclusion and Discussion 

6.1 Summary and Discussion 

In this paper we introduced a specification method for tools in the domain of 
civil engineering. Different tools provide support for knowledge engineering and 
conceptual design, respectively. Analyses within either the knowledge engineer 
or the architecture tool guarantee internal consistency with the basic knowl- 
edge interactively introduced. Furthermore, analyses guarantee the consistency 
of a design result with the building type specific knowledge. Correspondingly, 
the specifications are split into three parts. The interactively elaborated domain 
knowledge consists on the one side of a basic part which is useful for several 
classes of buildings. The specific part on the other side represents the knowledge 
about one class of buildings. 

The specification of the knowledge engineering tools allows the introduction
of basic model nodes for attributes and relations. Furthermore, the specific
knowledge is elaborated by model instance nodes for areas, relations, and
attributes. The complete information, dependent on the input of the knowledge
engineer, is kept in the domain model graph. This information is used by the
specification for the designer tools, namely by invoking the analyses between
designer results and specific domain knowledge.

So, resulting tools are parameterized. In the same way, the (architecture) 
tool specification is parameterized in the sense that it depends on specific knowl- 
edge to be put in, altered, or exchanged. More specifically, it uses a host graph 
produced by the knowledge engineer specification. The interactively determined 
knowledge information can be regarded as dynamic type information. 

Fig. 11 shows both approaches, namely the traditional and the parameterized one,
to specify tool behavior. In the traditional way (left side) the specific knowledge
is contained in the specification of the architecture tool. Whenever the knowledge
changes, the specification has to be changed and an automatic tool construction
process has to be started. On the right side there is a sketch of the parameterized
approach presented in this paper. The knowledge engineer tool has, in its initial
form, no specific knowledge. This is interactively elaborated. The resulting host
graph (domain model graph) acts as typing information for the architecture tool.
The basic knowledge information is imported by the design tool. The specific
knowledge information is used for building type-dependent analyses of a concrete
design result.

Both specification methods have pros and cons. If the knowledge is fixed, then the
traditional way is advantageous. More checks can be carried out at specification
elaboration time, which is more efficient. If the underlying knowledge changes,
as is the case with our experimentation platform, the parameterized method
is not only better but necessary. Here, changes of the underlying knowledge need
no modification of tools. The price is to have more, and more complicated, checks
at tool runtime, which are more costly due to levels of indirection. Furthermore,
the specifications contain less structural graph information and, therefore,
are more difficult to read and write.

6.2 Related Work in Civil Engineering

Let us now compare the results of this paper with other papers in the area
of conceptual design in civil engineering on one side, and with other graph
specification approaches on the other. Let us start with the design literature and
concentrate on those approaches which also use graphs. There are several
approaches to support architects in design. Christopher Alexander describes a way
to define architectural design patterns [1]. Although design patterns are
extensively used in computer science, in architectural design this approach has
never been formalized, implemented, and used. In [8] shape grammars are
introduced to support architectural design, e.g. the design of Queen Anne
houses [6]. The concept of shape grammars is related to graph grammars. However,
this approach supports the generation of building designs rather than the
interactive support during design that we propose.

Graph technology has been used in [9] to build a CAD system that supports
the design process of a kitchen. In contrast to our approach, the knowledge
is hard-wired in the specification. In [3, 4] graph grammars are used to find
optimal positions of rooms and to generate an initial floor plan as a suggestion for
the architect. Formal concept analysis [18] and conceptual graphs [ 7] describe
a way to store knowledge in a formally defined but human readable form. The
TOSCANA system [5] is a system to store building rules.



6.3 Comparison to Other GraTra Specification Approaches 

Finally, we are going to relate our graph specification method to others in the 
area of graph technology. We concentrate on those papers where typical and 
different tool specification methods are applied. In the AHEAD project [10], 
a management system for development processes is developed. AHEAD distin- 
guishes between a process meta model, to define the general knowledge hard- 
wired in the specification, and the process model definition to represent domain 
specific knowledge, which can be elaborated or changed at runtime. Neverthe- 
less, the tool construction process has to be run again to propagate changes to 
the AHEAD prototype. 

In the E-CARES project [13] graph-based tools are developed to support the
understanding and restructuring of complex legacy telecommunication systems.
The specific domain knowledge consists in this case e.g. of the formal definition
of the underlying programming language, to be found in a specific specification.
As a result of a scanning and parsing process, a host graph is automatically
created representing a system's structure. When the specific knowledge changes,
the parser and the specific part of the PROGRES specification have to be adapted
and the tool construction process has to be restarted.

In the CHASID project [7] tools are introduced to support authors writing
well-structured texts. Its specification method resembles the one presented in
this paper. The specific domain knowledge is here stored in so-called schemata;
they are again elaborated at runtime. In contrast to our approach, however,
the defined schemata are directly used to write texts and not to be checked
against a text to uncover structural errors. So, the main advantage of the new
specification method of this paper is a gain in flexibility!



Abstract. The use of UML extension mechanisms for the definition of
an Agent-Oriented Modeling Language only fixes its syntax. But agent 
concepts demand an appropriate semantics for a visual modeling lan- 
guage. Graphs have been shown to constitute a precise and general se- 
mantic domain for visual modeling languages. The question is how agent 
concepts can be systematically represented in the semantic domain and 
further on be expressed by appropriate UML diagrams. We propose a lan- 
guage architecture based on the semantic domain of graphs and elements 
of the concrete syntax of UML. We use the proposed language architec- 
ture to define parts of an agent-oriented modeling language. 



1 Introduction 

Agents and related concepts have been shown to be useful abstractions that
extend standard Object-Orientation. To employ these concepts in practical
software development they need to be represented in the models preceding the
system implementation. Thus, there is a need for a modeling language that
incorporates agent-based features. Building such a language only upon the standard
OO modeling language for the software industry, the UML, proves to be difficult.
UML and the underlying language architecture (MOF) focus on the definition
of syntax only and neglect semantic issues (see also Sect. 2.4 for details).

Yet, a precise denotation of the semantics of added features is important 
for several reasons: The definition of a new language is motivated by new se- 
mantic concepts. For example, an agent-oriented modeling language is based on 
the concept of an agent which is different from other concepts like that of an 
object (cf. [7]). A precise semantics of the language is necessary for consistently 
reflecting agent concepts like roles, protocols, and goals. Moreover, unique re- 
quirements for the implementation of an agent based system rely on the precise 
description of the structure and the behavior of a system. 

The question arises whether a new modeling language is necessary at all. In 
section 3 we will show that the available features of UML, without extension and 
semantics, are not sufficient to support the agent concepts of role and protocol
appropriately. Thus, UML is tailored to a specific agent-oriented modeling
profile with an appropriate semantics. In this way, the syntactically correct
usage of the language ensures the correct application of agent concepts. Because
the syntax of the language is defined using the extension mechanisms of UML, we
do not introduce a completely new syntax, which enhances the acceptance of the
language. In a nutshell, we aim at defining a UML-based agent-oriented modeling
language in a more rigorous way.

HPRN-CT-2002-00275, [Research Training Network SegraVis],

In order to introduce a precise semantics of the agent-oriented modeling lan- 
guage (AML) we rely on the semantic domain of graphs. We propose a language 
design with three different levels of description, see Sec. 2. On the instance level 
we have instance graphs which represent states of the modeled system. On the 
model level we use typed attributed graph transformation systems (TAGTS) 
which consist of type graphs and graph transformation rules. They are used for 
expressing structural and dynamical models of systems. On the meta level we use 
a meta type graph for restricting the type graphs of TAGTS to desired elements 
of the language. 

Orthogonal to the structure of the semantic domain is its relationship to 
the syntax of a modeling language. A solution to the problem is to relate the 
concrete syntax to the abstract syntax of the language and to transform the 
abstract syntax to the semantic domain. This solution has the advantage that 
the abstract syntax usually offers a homogeneous representation of the language 
which abstracts from aspects like the visual presentation of language elements. 
The next step is to relate the abstract syntax to the semantic domain. 

UML provides such a homogeneous abstract syntax. The meta model is given 
by class diagrams and user models are just instances of the classes. The concrete 
syntax is specified rather informally by example diagrams and descriptions of 
the mapping to the meta model. UML can be tailored for specific purposes of 
use by the profile mechanism. Thus, there are means to define the syntax of 
specific modeling languages. In the next section we present a language design 
which provides a mapping of the UML based syntax of a modeling language to 
the semantic domain of graphs being stratified in three levels of description. 

In section 3 we show how the proposed language architecture can be used to 
express the agent concepts of role and protocol in an agent-oriented modeling 
language. 

2 A Graph Based Language Architecture of AML 

The structure of the agent-oriented modeling language (AML) is determined 
by its syntax and semantics. The concepts of a language are essentially reflected
in the semantic domain of the language, which we deal with in this section. We
choose graph transformation as semantic domain because it has been shown 
to be appropriate for visual languages (cf. [8]). The semantic domain of graph 
transformation is structured in three different abstraction levels (see Fig. 1). We 
proceed to explain the different levels in more detail.

2.1 The Semantic Domain of Graph Transformation

The semantic domain of graphs represents instances, types, and meta types. The 
relationships among different levels of abstraction are formalized by notions from 
graph theory. The operational semantics of models is given in terms of rule-based 
graph transformation. 

More precisely, we use typed attributed graph transformation systems as 
a semantic domain. Graphs are attributed if their vertices or edges are coloured 
with elements of an abstract data type (like Strings or Integers) [12, 4]. In our 
context, entities which carry attributes will only be represented as vertices. Thus, 
graphs with coloured vertices are sufficient for us. 

Mathematically, abstract data types are represented as algebras over appro- 
priate signatures. A many sorted signature consists of type symbols (sorts) and 
operation symbols. For each operation its arguments’ types and its result type 
are defined. An algebra for a signature consists of domain sets for the type sym- 
bols of the signature and it defines for each operation symbol of the signature an 
operation with respect to the domain sets. The integration of graphs and algebras 
results in attributed graphs [12]. Attribute values are elements from a domain of 
the algebra. In an attributed graph they are contained as data vertices in the
set of vertices. Attributes are considered as connections between non-data entity
vertices and data vertices, i.e., attribute values. The introduction of attribute
values from an algebra ensures that graph transformations change values only 
according to the laws of the algebra. 
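
The construction above can be illustrated by a minimal Python sketch. It is not part of the formal definition in [12]: the class names Algebra and AttributedGraph, the example operations concat and length, and the encoding of domain sets as membership tests are assumptions made for this illustration only. An attribute edge from an entity vertex to a data vertex is encoded as a dictionary entry.

```python
from dataclasses import dataclass, field


@dataclass
class Algebra:
    """A many-sorted signature together with one algebra interpreting it."""
    sorts: dict       # sort name -> membership test for its domain set
    signature: dict   # operation symbol -> (argument sorts, result sort)
    operations: dict  # operation symbol -> Python function

    def has_sort(self, value, sort):
        return self.sorts[sort](value)

    def apply(self, op, *args):
        arg_sorts, result_sort = self.signature[op]
        if len(args) != len(arg_sorts):
            raise TypeError(f"{op} expects {len(arg_sorts)} argument(s)")
        for value, sort in zip(args, arg_sorts):
            if not self.has_sort(value, sort):
                raise TypeError(f"{value!r} is not of sort {sort}")
        result = self.operations[op](*args)
        if not self.has_sort(result, result_sort):
            raise TypeError(f"{op} did not return a value of sort {result_sort}")
        return result


@dataclass
class AttributedGraph:
    algebra: Algebra
    entities: set = field(default_factory=set)
    # Each entry stands for an attribute edge from an entity vertex to a
    # data vertex: (entity, attribute name) -> (sort, value).
    attributes: dict = field(default_factory=dict)

    def set_attribute(self, entity, name, sort, value):
        if entity not in self.entities:
            raise KeyError(entity)
        if not self.algebra.has_sort(value, sort):
            raise TypeError(f"{value!r} is not of sort {sort}")
        self.attributes[(entity, name)] = (sort, value)

    def update_attribute(self, entity, name, op, *extra_args):
        """Change a value only by applying an operation of the algebra."""
        sort, value = self.attributes[(entity, name)]
        if self.algebra.signature[op][1] != sort:
            raise TypeError(f"{op} does not yield a value of sort {sort}")
        new_value = self.algebra.apply(op, value, *extra_args)
        self.attributes[(entity, name)] = (sort, new_value)


# Hypothetical example algebra with the sorts String and Integer.
STRINGS_AND_INTEGERS = Algebra(
    sorts={
        "String": lambda v: isinstance(v, str),
        "Integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    },
    signature={
        "concat": (("String", "String"), "String"),
        "length": (("String",), "Integer"),
    },
    operations={"concat": lambda a, b: a + b, "length": len},
)
```

In this sketch update_attribute is the only way to change an existing value, which mirrors the remark that transformations change values only according to the laws of the algebra.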

On the instance level system states are given by attributed graphs. Entity 
vertices are used to represent agents, roles, messages, etc. as members of a sys- 
tem state defined by an attributed graph. Data vertices are attribute values for 
agents, roles, etc. Each of these data vertices is connected by an edge to the 
entity vertex which possesses the corresponding attribute.

An attributed graph morphism is a mapping between two graphs which respects
the connections between vertices and edges [12]. Additionally, data vertices
are only mapped to data vertices. An attributed type graph is an attributed
graph in which the elements of the algebra represent the type symbols (sorts) 
of the data vertices. Now, an attributed graph is typed by an attributed type 
graph if there exists an attributed graph morphism from the former to the latter 
graph. 
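
As a hedged illustration of the typing relation, the following Python sketch checks whether a vertex mapping induces a morphism from an instance graph to a type graph. It simplifies the definition in [12]: edges are unlabeled pairs, at most one edge is assumed between two type vertices (so the edge mapping is determined by the vertex mapping), and the names Graph and is_typed_by as well as the example vertices are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    vertices: dict                             # vertex id -> "entity" or "data"
    edges: set = field(default_factory=set)    # (source id, target id)


def is_typed_by(instance, type_graph, vertex_map):
    """Check that vertex_map induces a morphism from instance to type_graph."""
    for vertex, kind in instance.vertices.items():
        image = vertex_map.get(vertex)
        if image not in type_graph.vertices:
            return False                       # the mapping must be total
        if kind == "data" and type_graph.vertices[image] != "data":
            return False                       # data vertices map to data vertices
    # Every edge must be mapped onto an edge between the images of its ends.
    return all(
        (vertex_map[source], vertex_map[target]) in type_graph.edges
        for source, target in instance.edges
    )


# Hypothetical type graph: a role is attached to an agent, an agent has a name.
TYPE_GRAPH = Graph(
    vertices={"Agent": "entity", "Role": "entity", "String": "data"},
    edges={("Role", "Agent"), ("Agent", "String")},
)

# Hypothetical system state typed over TYPE_GRAPH.
STATE = Graph(
    vertices={"browser": "entity", "requester": "entity", "name_b1": "data"},
    edges={("requester", "browser"), ("browser", "name_b1")},
)
TYPING = {"browser": "Agent", "requester": "Role", "name_b1": "String"}
```

With these definitions, is_typed_by(STATE, TYPE_GRAPH, TYPING) holds, whereas a mapping that sends the data vertex name_b1 to Agent is rejected.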

As usual, a graph transformation rule consists of a left hand side and a right
hand side. The rule is typed if both sides are typed by the same type graph. An 
attributed type graph together with a set of typed graph transformation rules 
form a Typed Attributed Graph Transformation System (TAGTS) [12]. The oper- 
ational semantics of a TAGTS depends on the chosen graph rewrite mechanism. 
We choose the double-pushout (DPO) approach, in which every change of the state
must be explicitly specified in the rules (cf. [7]).
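
To make the rewriting step more concrete, the following Python sketch applies a rule at a given match in the style of the DPO approach. It is a simplification under stated assumptions: graphs are untyped and unattributed, edges are (source, label, target) triples without parallel duplicates, the match is required to be injective, and the interface graph K is given explicitly as the part shared by both sides. The names Graph, Rule and apply_dpo are illustrative only.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Graph:
    vertices: set = field(default_factory=set)
    edges: set = field(default_factory=set)    # (source, label, target)


@dataclass
class Rule:
    left: Graph        # L
    interface: Graph   # K, the preserved part contained in both L and R
    right: Graph       # R


def apply_dpo(rule, host, match):
    """Return the rewritten graph, or None if the rule is not applicable."""
    L, K, R = rule.left, rule.interface, rule.right

    # The match must map every vertex of L injectively into the host graph
    # and must preserve the edges of L.
    if set(match) != L.vertices or not set(match.values()).issubset(host.vertices):
        return None
    if len(set(match.values())) != len(match):
        return None
    matched_edges = {(match[s], label, match[t]) for s, label, t in L.edges}
    if not matched_edges.issubset(host.edges):
        return None

    deleted_vertices = {match[v] for v in L.vertices - K.vertices}
    deleted_edges = {(match[s], label, match[t]) for s, label, t in L.edges - K.edges}

    # Dangling condition: a vertex may only be deleted if all of its incident
    # edges are deleted by the rule as well.
    for s, _, t in host.edges - deleted_edges:
        if s in deleted_vertices or t in deleted_vertices:
            return None

    # Preserved vertices keep their identity; created vertices get fresh ids.
    embed = {v: match[v] for v in K.vertices}
    fresh = count()
    for v in sorted(R.vertices - K.vertices):
        name = f"n{next(fresh)}"
        while name in host.vertices:
            name = f"n{next(fresh)}"
        embed[v] = name

    created_edges = {(embed[s], label, embed[t]) for s, label, t in R.edges - K.edges}
    return Graph(
        vertices=(host.vertices - deleted_vertices) | set(embed.values()),
        edges=(host.edges - deleted_edges) | created_edges,
    )
```

Everything outside the image of L is left untouched, and a rule that would leave a dangling edge is simply not applicable. This reflects the property that every change of the state has to be stated explicitly in the rule.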

We use three different levels of description: on the model level structural and 
dynamic properties of a system are described. Such models must conform to 
the language description on the meta level. On the instance level system states 
evolve according to the system description on the model level. 

On the model level typed attributed graph transformation systems (TAGTS) 
provide the semantic domain of agent-oriented models, see Fig. 1. The attributed 
type graph of a TAGTS is used for typing attributed graphs on the instance level, 
i.e., system states. 

The attributed type graph of a TAGTS is itself typed by the attributed 
meta type graph on the meta level, see Fig. 2. This graph determines the kinds 
of graph elements and admissible graphs on the model level. 

The set of graph transformation rules of a TAGTS determines the dynamics 
of the system. The graph rules of the TAGTS must conform to the attributed 
type graph of the TAGTS. On the instance level subsequent system states are 
generated by graph transformations that result from the application of graph 
transformation rules to an admissible system state. 

We will discuss the syntactical aspects of the language in section 2.2. Next, 
we use the presented language architecture to define the structure of an agent- 
oriented modeling language. 

We regard essential elements of the agent-oriented modeling language AML 
and relate them in a meta type graph, see Fig. 2. The structure of this graph 
is motivated by the agent concepts and their dependencies. In section 3, the 
concepts of role and protocol are discussed in more detail. Next, we will motivate 
parts of the meta type graph roughly. A detailed discussion of the requirements 
for an agent-oriented modeling language can be found in [7]. 

The initial element is the Agent which contains attributes and operations. 
A Data element is considered as data container which contains attributes and 
links to other Data elements. Roles are attached to agents in order to make agents 
interact. Roles contain attributes, operations and messages. They are able to send 
or receive messages through the execution of one of their operations. A message 
is named and it has a list of parameter data entities. A protocol comprises roles
and data. A subset of the set of data entities may be marked as parameters of
the protocol. These parameters are substituted by data entities of the context
in which the protocol is used. Attributes are given by nodes labeled with
their sort. In Fig. 2 the sort String is introduced. It is connected to all entities
which possess attributes of type String.

Fig. 2. The meta type graph of structural elements of AML

The structure of the semantic domain influences the syntactic representation 
of models which we discuss now. 

2.2 The Syntax of Models 

We deal with the syntax of models by introducing a profile for the agent-oriented 
modeling language AML, see Fig. 3. A stereotype is introduced in the profile for
every vertex of the meta type graph. For each stereotype its base meta
class is designated. A description explains the stereotype and a constraint re- 
stricts its application. The constraints of the stereotypes transfer semantic in- 
formation to the syntax of the language. In Fig. 3, the constraints establish an 
(incomplete) restriction of UML language elements in order to represent agent 
concepts appropriately. A more formal description of constraints results from 
using a constraint language like OCL. The formalization is omitted here. 

On the model level the semantic domain consists of a TAGTS which com- 
prises a typed attributed graph and a set of graph transformation rules. The 
typed attributed graph is represented by an agent class diagram, see Fig. 5. The
typing of the model elements is expressed by attaching stereotypes from the
profile of the language AML to the graph elements’ identifiers. In this way the attributed
type graph morphism from the TAGTS to the attributed meta type graph is 
represented syntactically. 

A graph transformation rule is represented by a stereotyped package diagram 
in which the left and right hand sides are distinguished. Two packages named 
Left and Right are inserted in the rule package and contain diagrams for the left
and right hand sides of the rule which conform to the AML profile. An example
is given in Fig. 4. The diagrams represent the left and right hand sides of a graph
transformation rule which are both typed by the attributed type graph of the
TAGTS. The correct typing is ensured by the use of the AML profile. We prefer
this new notation because the left hand side and the right hand side of a rule are
clearly separated. This differs from notations relying on UML collaboration
diagrams. Also, the labels of the packages accurately indicate the context of the
rule.

Fig. 3. The UML profile of AML

Before we deal with the semantics of agent interaction protocols we compare 
our approach to some related work.

Fig. 4. A «Rule» package diagram



2.3 Related Approaches to Agent-Oriented Modeling 

Several approaches to agent-based modeling already exist. We consider AgentUML,
Tropos, and MaSE, which introduce agent concepts or use UML.

AgentUML is a prominent proposal for the extension of UML with language 
elements for agent concepts [1]. But there are two problems with AgentUML. 
First, the extension of UML neither uses the extension mechanisms of UML
explicitly nor describes the extension at least on the level of abstract
syntax, i.e., the meta model. Second, AgentUML lacks a formal semantics which
reflects agent concepts precisely. 

Tropos is another approach to agent-oriented modeling which is intended to 
support different activities of software development [10]. Different diagrams are 
introduced for capturing actors and their respective dependencies in an organiza- 
tional model. Goals and their dependencies are depicted in diagrams of a specific 
non-UML type. In the design activity diagrams use elements of AgentUML [1]. 
Altogether, Tropos supports visual modeling of agent-based systems to some 
degree and it barely relies on UML. 

Deloach et al. [5] introduce the Multi-agent Systems Engineering (MaSE) 
methodology for developing heterogeneous multiagent systems. MaSE uses 
a number of graphically based models to describe system goals, behaviors, agent 
types, and agent communication interfaces. The visual models are only roughly 
based on UML. Thus, the approach lacks a precise syntax, e.g., given by a UML 
profile. The semantics of the models is not defined. 

Next, we discuss related work concerning the definition of modeling languages.

Fig. 5. An agent class diagram



2.4 Related Meta Modeling Approaches 

The most common way of defining modeling languages (at least the one stan- 
dardized by the OMG) is the Meta Object Facility (MOF) [13]. The MOF intro- 
duces 4 layers of models. The abstract syntax of each layer is defined by providing 
a class diagram on the level above it. As models are thus used to define modeling 
languages, the term meta-modeling has been coined for this approach. Each layer 
of the MOF may define concrete syntax representations for the elements it intro- 
duces. For instance, the UML, the best known MOF-based language, introduces 
lots of diagrams which all provide their own special notation but are defined in 
terms of the MOF meta-model. The widely known shortcomings of the MOF 
approach are that it does not supply facilities to define the semantic domain
and that the instance relation between the different levels is not properly defined
(see, e.g., the various discussions in the precise UML group). Our approach
addresses these problems by providing an explicit semantic domain (graphs) and
by using well-known structures in this domain (typegraph-relations) to precisely 
define the meaning of level-spanning instantiation. 

Other approaches have also set out to extend the MOF to improve its precision.
A feasibility study for IBM [3] suggests a recursive structure of definition
levels in which each level is used to define the structure of the level beneath it 
as well as the instantiation concept that this level should provide. This instan- 
tiation concept is also the (denotational) semantics since the purpose of each 
level is to define the constructs on the lower levels. The semantics of behavioral 
constructs is not addressed in this approach. 

Another approach [2] proposed a mathematical notation based on sets for 
the precise definition of UML’s semantics. Sets are also underlying the approach 
in [14] which suggests Model Transformation Systems, a graph-transformation-based
approach extended by high-level control constructs to provide a description
of the dynamic semantics. Our approach avoids the use of such additional
constructs and uses standard graph transformations only. 

Graph Transformations have furthermore been used to provide operational 
semantics for the UML in an interpreter-like style [8] and in a compiler-like 
translation approach [11]. Our approach uses the same basic ideas but takes 
a more fundamental view and investigates the impact that graph transformations 
as a semantic domain have for the whole language architecture.

3 Modeling Agent Interaction Protocols with AML

The modeling of agent based systems must take properties and specific behav- 
ior of agents into account. Typical properties of agents are their autonomous 
behavior, structured interaction and proactivity (cf. [7]). We concentrate on au- 
tonomy and interaction. Autonomy emphasizes the fact that agents have control 
over their own operations: They are not called from outside like methods of an 
object but are only invoked by the agent itself. Thus, agents own separate threads 
of control, they have only little (and explicitly stated) context dependencies, and 
they interact in an asynchronous way. 

We aim at developing a modeling language which captures the interaction
behavior of agents in the semantic domain. To this end, we apply the language
architecture proposed above.

In order to support the general reuse of often occurring interaction patterns 
the interaction between agents is often described in terms of protocols, which 
represent templates of coordinated behavior (see, e.g., [9]). Instead of making 
reference to the agents involved in an interaction, reusability requires that we 
reduce the description of the participating agents to such features and properties 
that are relevant to the protocol. For this purpose, the concept of role will be 
used. In the protocols roles interact on behalf of the agents. This idea gives
rise to specific requirements for roles (cf. [7]). We consider only some of these
requirements.

Unlike interfaces in UML, roles are instantiable because state information is
necessary if roles are protocol participants that have to store the state of an
interaction. A role instance is bound to an agent in order to make it interact.
Roles can be dynamically attached to and retracted from an agent. If a role is 
attached to an agent then the agent’s state must change during the interaction. 
Otherwise, the interaction would not have any effect on the agent. 

Roles are specified in terms of attributes, operations and messages. Attributes 
store the state of an interaction and operations of roles are able to send and 
receive messages. A role comes into use when it is bound by the role-of relationship
between the role and a base agent. In order to establish an interaction between 
some agents the role-of relationship has to carry a suitable semantics which 
fulfills the above mentioned requirements. Considering the behavior of a role 
bound to a base agent, the operations of a role must also affect the state of 
its base agent. Next, we will discuss how this specific semantics of roles can be 
modeled precisely. 

Now, we discuss how the requirements can be fulfilled in UML. UML offers 
sequence diagrams which can be used to describe the message exchange between 
instances. But sequence diagrams do not allow the description of internal state 
changes of the concerned instances when a message is sent or received. The 
same problem exists with collaboration diagrams which offer very similar fea- 
tures. Besides the collaboration diagrams on the instance level there are also 
collaboration diagrams on the specification level. They allow the definition of 
behavioral patterns for some context of collaboration roles and the interaction 
within this context. The use of a pattern demands substitution instances for the
collaboration roles. In our case an agent would have to be substituted for a
collaboration role. This kind of “binding” is not useful in our case because an
agent is able to play different roles in different interactions. Instead, we use
the template mechanism of UML. By template packages we will define both the
structure of a protocol and its behavior given by graph transformation rules.

Fig. 6. The template package for the structure of the request-response protocol

An example shows how protocols are given in a notation for agent-specific 
model elements. We use a trivial request-response protocol which is a typical
example of a connector between agents enabling interaction (see [9]). Note that the
protocol is rather incomplete. In particular, the behavior of the responder must be
refined in order to be useful. Nevertheless, we are able to demonstrate how protocols
are used in our approach. 

The structure of the request-response protocol is given in Figure 6. The 
protocol is given by an agent class diagram which is contained in a template 
package. The parameters Request and Response of the template refer to two 
different data classes within the package. 

The operations sendRequest and sendResponse are specified by use of graph
transformation rules, see Figure 7 and Figure 4. Rules are given by a pair of 
instance diagrams which represent the left and right hand sides of the rule. In 
the rules, the delivery of a message is represented by an arrow whose head is 
attached to the receiver of the message. All rules of the protocol are integrated 
in one package for the protocol rules, see Figure 7. 

The data entity Request is used as a parameter of the operation sendRequest
which is used by the role Requester in order to send a request message rq to
a receiver. The role Responder uses its operation sendResponse to answer the
request. It attaches a message ack with parameter data Response to the role
Requester.

Fig. 7. The rules of the request-response protocol

The binding of a protocol to a given model is a pure syntactic mechanism 
of macro expansion: The parameters of the protocol template are filled in with 
model elements from a given agent class diagram (shown in Fig. 5). Then, the 
model elements of the protocol are inserted in the agent class diagram and the 
role classes are attached to agent classes by the role-of relationship. 
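
The binding step can be sketched as a plain substitution, here in Python. This is only an illustration of the macro expansion described above, not the UML template mechanism itself: the classes ClassDiagram and ProtocolTemplate, the function bind_protocol, and the association label "sends" are assumptions; the class names follow the request-response example.

```python
from dataclasses import dataclass, field


@dataclass
class ClassDiagram:
    classes: dict                                   # class name -> stereotype
    associations: set = field(default_factory=set)  # (source, label, target)


@dataclass
class ProtocolTemplate:
    parameters: tuple       # names of the formal template parameters
    diagram: ClassDiagram   # structure of the protocol, parameters included


def bind_protocol(template, model, actuals, role_of):
    """Expand a protocol template into a copy of the given agent class diagram."""
    if set(actuals) != set(template.parameters):
        raise ValueError("every template parameter needs exactly one actual")
    for actual in actuals.values():
        if actual not in model.classes:
            raise KeyError(f"{actual} is not part of the model")

    def substitute(name):
        return actuals.get(name, name)

    classes = dict(model.classes)
    for name, stereotype in template.diagram.classes.items():
        if name not in actuals:          # parameters are replaced, not copied
            classes[name] = stereotype
    associations = set(model.associations) | {
        (substitute(source), label, substitute(target))
        for source, label, target in template.diagram.associations
    }
    # Attach every role class to its base agent class.
    for role, agent in role_of.items():
        if template.diagram.classes.get(role) != "Role":
            raise ValueError(f"{role} is not a role of the protocol")
        if model.classes.get(agent) != "Agent":
            raise ValueError(f"{agent} is not an agent of the model")
        associations.add((role, "role-of", agent))
    return ClassDiagram(classes, associations)


REQUEST_RESPONSE = ProtocolTemplate(
    parameters=("Request", "Response"),
    diagram=ClassDiagram(
        classes={"Requester": "Role", "Responder": "Role",
                 "Request": "Data", "Response": "Data"},
        associations={("Requester", "sends", "Request"),
                      ("Responder", "sends", "Response")},
    ),
)
```

Binding REQUEST_RESPONSE to a model with the agents browser and server and the data classes Address and Content yields a diagram in which Requester refers to Address and is attached to browser by a role-of association.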

The graph transformation rules of the protocol are expanded in the following 
way: For each role in a rule an agent is inserted in the rule. The role and the agent 
are connected by the role-of relationship. The correlation of roles and agents 
must conform to the expanded agent class diagram. Then, model elements which 
have been assigned to protocol parameters are inserted in the rule. Now, these 
expanded graph transformation rules reflect the change of the system behavior 
through the expanded protocol. 
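
The expansion of a rule can likewise be sketched in Python. The sketch makes several assumptions that are not taken from the paper: both sides of a rule are represented as graphs whose vertices carry a type name, the base agent of a role instance is named by appending "_agent" to the role instance's identifier (assumed not to clash with existing identifiers), and parameter substitution is modeled as retyping the parameter vertices. The names TypedGraph, expand_side and expand_rule are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TypedGraph:
    vertices: dict                             # vertex id -> type name
    edges: set = field(default_factory=set)    # (source id, label, target id)


def expand_side(side, role_to_agent, parameters):
    """Expand one side of a protocol rule for a given binding."""
    vertices = {}
    edges = set(side.edges)
    for vertex, vertex_type in side.vertices.items():
        # Protocol parameters are replaced by the model elements bound to them.
        vertices[vertex] = parameters.get(vertex_type, vertex_type)
        if vertex_type in role_to_agent:
            # Each role instance receives a base agent and a role-of edge.
            agent = f"{vertex}_agent"
            vertices[agent] = role_to_agent[vertex_type]
            edges.add((vertex, "role-of", agent))
    return TypedGraph(vertices, edges)


def expand_rule(left, right, role_to_agent, parameters):
    """Expand both sides in the same way so that agents are preserved."""
    return (expand_side(left, role_to_agent, parameters),
            expand_side(right, role_to_agent, parameters))
```

Because the same deterministic naming is used on both sides, an inserted base agent occurs on the left and on the right hand side and is therefore preserved by the expanded rule.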

We show the expansion of the request-response protocol. Syntactically, a
role-of relationship is depicted as an arrow with a filled triangle head. In Figure 8
the binding of a protocol is shown with respect to the two agents browser and 
server which interact by playing the roles Requester and Responder. The data 
entities of type Address and Content act as Request and Response. 

In Figure 9 the binding of the rule sendRequest is depicted in a rule diagram.
The rule is extended by instances of the base agents to which the roles of the
protocol are bound according to the agent class diagram in Figure 8. The rule is
also extended by an instance of a protocol parameter (Address) in order to affect
a part of the (system) state which is related to the base agent.

Fig. 8. Two agents use the request-response protocol (agent class diagram)

In the context of our proposed language architecture the resulting agent class 
diagram and the rule diagrams map to a corresponding type graph and to graph 
transformation rules of a TAGTS. Thus, the modification of the agent class 
diagram and rule diagrams becomes semantically effective. 

The intended semantics of agent interaction by protocols is obtained precisely
by combining a purely syntactic expansion mechanism with a mapping
to the semantic domain of graph transformation. In the context of protocol
expansion graph transformation is an appropriate semantic domain because the
necessary change of system behavior is achievable by purely structural modification
of graph transformation rules. It is not necessary to change the operational
semantics of the TAGTS in order to enable protocol-based interaction.

Thus, the proposed language architecture is well suited to the syntax and
semantics of elements of an agent-oriented modeling language.

4 Conclusion 

In this paper we introduced a language architecture that relies on visual elements
from UML and on the semantics of graph transformation. The abstract syntax 
of diagrams of the new language is tailored to the needs of the semantic domain 
of graphs. We have shown how concepts for elements of an agent-oriented
modeling language can be described precisely within the architecture. Based on the
operational semantics of the DPO approach we demonstrated the semantics of
protocol-based interaction of agents.

Fig. 9. The rule sendRequest bound to an agent (rule diagram)

In the next step more elements of an agent-oriented modeling language are
to be integrated. The semantic relation of the different elements is to be de- 
scribed in more detail. Further on, the use of the modeling elements in a software 
development process must be clarified. 

The precisely defined agent-oriented models can be used to reason about 
properties of agent based systems. For example, by applying techniques like 
model checking it becomes possible to decide whether an agent is able to reach 
a certain state. With respect to agents this is an important question because 
particular states of an agent often characterise goals which an agent tries to 
reach. First results regarding these aspects are contained in [6].

Abstract. In this paper we make use of formal methods and tools as
means to specify and reason about the behavior of distributed systems in 
the presence of faults. The approach used is based on the observation that 
a fault behavior can be modeled as an unwanted but possible transition 
of a system. It is then possible to define a transformation of a model M1
of a distributed system into a model M2 representing the behavior of
the original system in the presence of a selected fault. We use a formal 
specification language called Object Based Graph Grammars to describe 
models of asynchronous distributed systems and present, for models writ- 
ten in terms of this language, the transformation steps for introducing 
a set of classical fault models found in the literature. As a result of this 
process, over the transformed model(s) it is possible for the developer 
to reason about the behavior of the original model(s) in the presence of 
a selected fault behavior. As a case study, we present the specification 
of a pull-based failure detector, then we transform this model to include 
the behavior of the crash fault model and analyze, through simulation, 
the behavior of the pull-based failure detector in the presence of a crash. 

1 Introduction 

The development of distributed systems is considered a complex task. In partic- 
ular, guaranteeing the correctness of distributed systems is far from trivial if we 
consider the characteristics of open systems, such as massive geographical
distribution; high dynamics (appearance of new nodes and services); no global
control; faults; lack of security; and high heterogeneity [21]. Among other
barriers, in open environments (e.g., the Internet) it is hard to assure the
correctness of applications
because we cannot be sure whether a failure is caused by a fault in the system 
under construction itself or by the environment in which it runs. It is therefore
necessary to provide methods and tools for the development of distributed
systems such that developers can have a higher degree of confidence in their
solutions. We have developed a formal specification language [4], called Object 
Based Graph Grammars (OBGG), suitable for the specification of asynchronous 
distributed systems. Currently, models defined in this formal specification lan- 
guage can be analyzed through simulation [1] [5]. Moreover, in our activities we 
are also working on an approach to formally verify, using model checking tools, 
models defined in OBGG. Besides the analysis of distributed systems specified in 
OBGG, we can also generate code for execution in a real environment, following 
a straightforward mapping from an OBGG specification to the Java program- 
ming language. By using the methods and tools described above we have defined 
a framework to assist the development of distributed systems. The innovative 
aspect of this framework is the use of the same formal specification language 
(OBGG) as the underlying unifying formalism [6]. 

The results achieved so far, briefly described above, have addressed the devel- 
opment of distributed systems without considering faults. In order to properly 
address the development of distributed systems for open environments, in this 
paper we present a way to reason about distributed systems in the presence of 
selected fault behaviors. Thus, we revise our framework in order to consider some 
fault models found in the literature [9]. The rationale is to bring the fault be- 
havior to the system model and reason about the desired system in the presence 
of faults. To achieve that, we show how to introduce selected fault behaviors in 
a model written in terms of the formal specification language OBGG. 

As stated in [10], a fault can be modeled as an unwanted but possible state 
transition of a system. Moreover, [10] states that the transformation of an original
model M1 into a model M2 considering a kind of fault consists of the insertion
of virtual variables and statements that define the fault behavior. We show how
to perform this transformation process for models written in OBGG, such that
the transformation from M1 into M2 preserves all possible computations
of M1 and adds the desired fault behavior. Having the transformed model M2,
the developer can reason about its behavior (currently through simulation tools,
but formal verification using model checking tools is being integrated into the
framework). Code for execution in a real environment can be generated using
the model M1. While running in a real environment, if the environment exhibits
the fault behavior corresponding to the one introduced in M2, then the system
should behave as expected during the analysis phase. 
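
The idea of the transformation can be illustrated independently of OBGG by a small Python sketch. It assumes a generic model given as a list of guarded transitions over a state dictionary, which is not the formalism used in this paper. The crash fault is introduced by one virtual Boolean variable (here called up, a name chosen for this illustration) and one additional transition; the names Transition, add_crash_fault and reachable are hypothetical as well.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Transition:
    name: str
    guard: Callable[[dict], bool]
    effect: Callable[[dict], dict]


def add_crash_fault(transitions, initial_state, flag="up"):
    """Transform a model M1 into a model M2 that may additionally crash."""
    def lift(t):
        # Original transitions stay enabled as long as no crash has occurred.
        def guard(state):
            return state[flag] and t.guard(state)

        def effect(state):
            return {**t.effect(state), flag: state[flag]}

        return Transition(t.name, guard, effect)

    # The fault itself is an unwanted but possible transition.
    crash = Transition("crash", lambda s: s[flag], lambda s: {**s, flag: False})
    return [lift(t) for t in transitions] + [crash], {**initial_state, flag: True}


def freeze(state):
    return tuple(sorted(state.items()))


def reachable(transitions, initial_state):
    """Enumerate the reachable states of a finite-state model."""
    seen = {freeze(initial_state)}
    pending = [initial_state]
    while pending:
        state = pending.pop()
        for t in transitions:
            if t.guard(state):
                successor = t.effect(state)
                if freeze(successor) not in seen:
                    seen.add(freeze(successor))
                    pending.append(successor)
    return seen
```

In this sketch every computation of M1 remains a computation of M2 (the one in which crash never fires), while the crashed states of M2 enable no further transition.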

This paper is organized as follows: Section 2 discusses related work; Section 3 
presents the formal specification language OBGG and introduces a case study 
used throughout the paper; Section 4 shows how to transform a specification to 
incorporate selected fault behaviors; Section 5 briefly analyzes the case study; 
finally, Section 6 brings us to the conclusions. 

2 Related Work 

Concerning related work, we have surveyed the literature to identify
approaches that allow developers to reason about distributed systems in the
presence of faults. As far as we could determine, no other approach is based on
specification transformations, provides various fault models, and supports
tool-based analysis.

The SPIN (Simple Promela INterpreter) model checker [12] enables a developer
to specify a system (using the formal specification language PROMELA
(PROcess/PROtocol MEta LAnguage)) and to formally verify it using model
checking. The similarity is that SPIN allows one to check models using channel
abstractions that may lose messages. The fault behavior is provided by the
supporting tool. Other fault behaviors are not provided.

Another work, but related to the development of mobile systems, is presented 
in [8]. The Distributed Join Calculus is a process calculus that introduces the
notions of locations and agents, where locations may be nested. The calculus
provides the notion of crash of a location, whereby all locations and agents
in the crashed location are halted. A crash of a location changes its basic
behavior by hindering any kind of reaction, such as communication or process
migration. Systems that are represented with this calculus can be formally
verified, but only by theorem proving, which requires more highly skilled
developers.

In [17] I/O-Automata are proposed as a means for specifying and reasoning
about distributed algorithms for asynchronous environments. I/O-Automata
were designed to have nice features, such as the composition of I/O-Automata
resulting in another I/O-Automaton. Also, it is possible to derive properties of
the composed automaton from the analysis of its component automata. In [16]
a rich set of distributed algorithms was modeled with I/O-Automata. In many
of these, faults are taken into account and new versions of the algorithms are
proposed. In this paper we have focused on fault representation, and not on how
to handle a specific kind of fault in a given system or distributed algorithm.
Considering the techniques around I/O-Automata, however, we could not identify
an approach such as the one proposed in this paper, whereby a fixed
transformation step to embed a selected fault behavior can be carried out in the
same way for different models. Representation of fault behavior with
I/O-Automata is carried out by manually extending the state and transitions of
the model, for each case.

3 Object Based Graph Grammars 

Graphs are a very natural means to explain complex situations on an intuitive
level. Graph rules may complementarily be used to capture the dynamic aspects
of systems. The resulting notion of graph grammars generalizes Chomsky
grammars from strings to graphs [20, 7]. The basic concepts behind the graph
grammar specification formalism are:

— states are represented by graphs;

— possible state changes are modeled by rules, where the left- and right-hand
sides are graphs; each rule may delete, preserve and create vertices and edges;

— a rule has read access to items that are preserved by this rule, and write
access to items that are deleted/changed by this rule;

— for a rule to be enabled, a match must be found, that is, an image of the
left-hand side of a rule must be found in the current state;

— an enabled rule may be applied, and this is done by removing from the 
current graph the elements that are deleted by the rule and inserting the 
ones created by this rule; 

— two (or more) enabled rules are in conflict if their matches need write access 
to common items; 

— many rules may be applied in parallel, as long as they do not have write 
access to the same items of the state (even the same rule may be applied in 
parallel with itself, using different matches). 
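
The notions of enabling, conflict and parallel application listed above can be
illustrated with a minimal Python sketch. It is not part of OBGG or of any tool
mentioned in this paper: states are simplified to flat sets of item names, and
the class Match and the functions enabled, in_conflict and apply_parallel are
hypothetical names introduced only for this illustration. Only the write/write
condition stated in the list is checked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    """One match of a rule in the current state (hypothetical representation)."""
    rule: str
    preserved: frozenset  # items the rule only reads
    deleted: frozenset    # items the rule deletes or changes (write access)
    created: frozenset    # items the rule inserts


def enabled(match, state):
    """A match is enabled if the image of its left-hand side lies in the state."""
    return (match.preserved | match.deleted) <= state


def in_conflict(m1, m2):
    """Two matches conflict when they need write access to a common item."""
    return bool(m1.deleted & m2.deleted)


def apply_parallel(matches, state):
    """Apply enabled, pairwise conflict-free matches in a single step."""
    for i, m in enumerate(matches):
        if not enabled(m, state):
            raise ValueError(f"{m.rule} is not enabled")
        if any(in_conflict(m, other) for other in matches[i + 1:]):
            raise ValueError(f"{m.rule} conflicts with another match")
    removed = frozenset().union(*(m.deleted for m in matches))
    added = frozenset().union(*(m.created for m in matches))
    return (frozenset(state) - removed) | added


# Illustrative state: one entity and two pending messages.
state = frozenset({"entity1", "msgA", "msgB"})
m_a = Match("rA", frozenset({"entity1"}), frozenset({"msgA"}), frozenset({"msgC"}))
m_b = Match("rB", frozenset({"entity1"}), frozenset({"msgB"}), frozenset())
m_c = Match("rC", frozenset(), frozenset({"msgA"}), frozenset())

print(in_conflict(m_a, m_c))                      # rA and rC both delete msgA
print(sorted(apply_parallel([m_a, m_b], state)))  # rA and rB only share a read
```

In this sketch rA and rB may run in parallel because they only share read
access to entity1, whereas rA and rC compete for msgA.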

Here we will use graph grammars as a specification formalism for concurrent
systems. The construction of such systems will be done componentwise: each
component (called entity) is specified as a graph grammar; then, a model of the
whole system is constructed by composing instances of the specified components
(this model is itself a graph grammar). Instead of using general graph grammars
for the specification of the components, we will use object-based graph
grammars (OBGG) [4]. This choice has two advantages: on the practical side, the
specifications are done in an object-based style that is quite familiar to most
users, and are therefore easy to construct, understand and consequently use
as a basis for implementation; on the theoretical side, the restrictions
guarantee that the semantics is compositional, reduce the complexity of
matching (allowing an efficient implementation of the simulation tool), and ease
the analysis of the grammar. Basically, we impose restrictions on the kinds of
graphs that are used and on the kinds of behavior that rules may specify.

Each graph in an object-based graph grammar may be composed of instances
of the vertices and edges shown in Figure 1. The vertices represent entities and
elements of abstract data types (the time stamps tmin and tmax are actually
special attributes of a message; we defined a distinguished graphical
representation for these attributes to increase the readability of the
specifications). Elements of abstract data types are allowed as attributes of
entities and/or parameters of messages. Messages are modeled as (hyper)arcs
that have one entity as target and the message parameters as sources (these may
be references to other entities, values or time stamps). Time stamps describe
the interval of time in which the message must occur, in terms of the minimum
and maximum time units relative to the current time. These time stamps (tmin
and tmax) are associated with each message (if they are omitted, default values
are assumed).

For each entity, a graph containing information about all attributes of this 
entity, relationships to other entities, and messages sent/received by this entity 
is built. This graph is an instantiation of the object-based type graph described 
above. All rules that describe the behavior of this entity may only refer to items 
defined in this type graph. 

A rule must express the reaction of an entity to the receipt of a message.
A rule of an object-based graph grammar must delete exactly one message (the
trigger of the rule), may create new messages to all entities involved in the
rule, and may change the values of attributes of the entity to which the rule
belongs. A rule shall not delete or create attributes, only change their values.
On the right-hand side of a rule, new entities may appear (entities can be
dynamically created). Besides, a rule may have a condition, that is, an equation
over the attributes of its left- and right-hand sides. A rule can only be
applied if this condition is true.
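
As a rough illustration of these restrictions, the following Python sketch
treats a rule as a handler for exactly one trigger message that may update
attribute values and create new messages. The classes Message, Rule and Entity
and the function apply_rule are hypothetical names chosen for this example; they
are not the notation of OBGG or of its tools, and dynamic creation of entities
is left out.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Message:
    name: str
    target: str          # identifier of the receiving entity
    params: tuple = ()   # references, values, ...


@dataclass
class Rule:
    name: str
    trigger: str                              # the one message this rule deletes
    condition: Callable[[dict, tuple], bool]  # guard over attributes and parameters
    effect: Callable[[dict, tuple], tuple]    # -> (attribute updates, new messages)


@dataclass
class Entity:
    ident: str
    attrs: dict


def apply_rule(entity, rule, msg):
    """Consume msg with rule and return the messages created by the rule."""
    if msg.name != rule.trigger or msg.target != entity.ident:
        raise ValueError("rule is not triggered by this message")
    if not rule.condition(entity.attrs, msg.params):
        raise ValueError("rule condition does not hold")
    updates, created = rule.effect(entity.attrs, msg.params)
    # Attributes may change value, but may not be created or deleted.
    if not set(updates) <= set(entity.attrs):
        raise ValueError("a rule may not create attributes")
    entity.attrs.update(updates)
    return created


# Illustrative entity: a counter that acknowledges each Inc message.
counter = Entity("c1", {"count": 0})
inc = Rule(
    "Inc", "Inc",
    condition=lambda attrs, params: attrs["count"] < 10,
    effect=lambda attrs, params: ({"count": attrs["count"] + 1},
                                  [Message("Ack", params[0])]),
)
created = apply_rule(counter, inc, Message("Inc", "c1", ("client",)))
print(counter.attrs, created)
```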

An Object-Based Graph Grammar consists of a type graph, an initial graph 
and a set of rules. The type graph is actually the description of the (graphi- 
cal) types that will be used in this grammar (it specifies the kinds of entities, 
messages, attributes and parameters that are possible - like the structural part 
of a class description). The initial graph specifies the start state of the system. 
Within the specification of an entity, this state may be specified abstractly (for 
example, using variables instead of values, when desired), and will only become 
concrete when we build a model containing instances of all entities involved in 
this system. As described above, the rules specify how the instances of an entity 
will react to the messages they receive. 

According to the graph grammars formalism, the computations of a graph 
grammar are based on applications of rules to graphs. Rules may be applied se- 
quentially or in parallel. Before going into more details on the way computations 
are built, we will briefly discuss how time is dealt with in our model. 

Time stamps of messages describe when they are to be delivered/treated. In
this way, we can program certain events to happen at some specific time in the
future. As rules have no time stamps, we assume that the application of a rule is
instantaneous. The time unit must be set in the specification. It is used
as a default increment for messages without a time stamp as well as to postpone
messages. Time stamps of messages are of the form (tmin, tmax), with tmin <
tmax, where tmin and tmax are to be understood as the minimum/maximum
number of time units, starting from the current time, within which the message
shall be treated. As the time stamps are always relative to the current time, it
is not possible to send messages that must be applied at the current time. The
semantic model currently used is a hard-time semantics: if a message is sent,
it must be treated within its specified time interval; if this is not possible,
the computation fails. In this paper, we will not use the maximum time limit for
messages; that is, in our case, there is no failure due to time constraints.
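
The treatment of time stamps can be sketched in Python as follows. This is only
an illustration under stated assumptions: time is taken to be an integer count
of time units, a missing tmax is taken to mean "no maximum limit", and the names
absolute_window, may_treat and TimingFailure are hypothetical and do not come
from the simulation tool.

```python
class TimingFailure(Exception):
    """Raised when a message cannot be treated within its time interval."""


def absolute_window(now, tmin=None, tmax=None, time_unit=1):
    """Translate a relative time stamp (tmin, tmax) into absolute times."""
    if tmin is None:
        tmin = time_unit  # default increment for messages without a time stamp
    if tmin <= 0:
        raise ValueError("a message cannot be treated at the current time")
    if tmax is not None and not tmin < tmax:
        raise ValueError("time stamps require tmin < tmax")
    earliest = now + tmin
    latest = None if tmax is None else now + tmax
    return earliest, latest


def may_treat(window, at):
    """Return True if the message may be treated at time `at`.

    Under hard-time semantics, passing the upper bound is a failure of the
    computation rather than a silent drop.
    """
    earliest, latest = window
    if latest is not None and at > latest:
        raise TimingFailure(f"deadline {latest} missed at time {at}")
    return at >= earliest


window = absolute_window(now=10, tmin=2, tmax=5)
print(window, may_treat(window, 11), may_treat(window, 13))
```

With tmax omitted, as in the rest of this paper, may_treat never raises and a
message simply waits until its minimum delay has elapsed.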

Each state of a computation of an OBGG is a graph that contains instances
of entities (with concrete values for their attributes), messages to be treated,
and the current time. In each execution state, several rules (of the same or
different entities) may be enabled, and therefore are candidates for execution
at that instant in time. Rule applications only have local effects on the state.
However, there may be several rules competing to update the same portion of
the state. To determine which rules will be applied at each time, we need
to choose a set of rules that is consistent, that is, in which no two
rules have write access to (delete) the same resources. Due to the restrictions
imposed in object-based graph grammars, write-access conflicts can only occur
among rules of the same entity. When such a conflict occurs, one of the rules is
(non-deterministically) chosen to be applied. This semantics is implemented in
the simulation tool PLATUS [2, 1, 5].
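
One simple way to obtain a consistent set of rule applications is a randomized
greedy selection, sketched below in Python. This is an assumption made for
illustration only; it is not claimed to be the strategy used by PLATUS. Each
candidate is a pair of a label and the set of items it needs write access to,
and the function name choose_consistent is hypothetical.

```python
import random


def choose_consistent(candidates, rng=random):
    """Pick a maximal set of candidates with pairwise disjoint write sets.

    candidates: list of (label, writes) pairs, where writes is a set of items.
    The random order models the non-deterministic choice among conflicting rules.
    """
    order = list(candidates)
    rng.shuffle(order)
    chosen, locked = [], set()
    for label, writes in order:
        if locked.isdisjoint(writes):
            chosen.append(label)
            locked |= writes
    return chosen


# Two rules of entity A compete for attribute A.x; entity B is independent.
candidates = [
    ("A.r1", {"A.msg1", "A.x"}),
    ("A.r2", {"A.msg2", "A.x"}),
    ("B.r1", {"B.msg1"}),
]
print(sorted(choose_consistent(candidates, random.Random(0))))
```

Whatever the random order, exactly one of the two rules of entity A is selected,
while the rule of entity B is always applied.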

3.1 Pull-Based Failure Detector 

Now we introduce a pull-based failure detector modeled using OBGG. In this
example, every pull-based failure detector of the model is modeled as an entity
called PullDetector. The type graph and rules of this entity are shown in
Figure 2. Some internal attributes of the PullDetector entity are abstract data
types (Map and List). The Map abstract data type is used to handle a collection
of tuples. The List abstract data type is the implementation of a linked list.
The GS entity is the group server for a particular group being monitored.
Through the GS the pull-based failure detectors can obtain references to
communicate with other participants of the group being monitored. The detailed
definition of the abstract data types Map and List, and of the GS entity, can be
found in [ 9]. The initial graph of this entity was omitted due to space
limitations (in Section 5 we show a specification that has some instances of
this initial graph).

In OBGG there is no embedded notion of local or remote communications.
However, a developer may specify minimum delays for messages. This notion
allows us to express some differentiation between remote and local communication
by assigning different minimum delay times. By default, all messages that do not
have minimum delays specified are treated as having the lowest minimum delay
possible. In the case study presented in this section we have explicitly
specified only the minimum remote communication times (mrct).

Figure 2 shows the rules that define the behavior of the PullDetector entity.
A PullDetector entity monitors the activity of a known group of PullDetector
entities (attribute groupserver) using liveness requests. The monitoring
behavior is started at the beginning of the system, when a Qliveness message is
sent to each PullDetector entity of the system. The Qliveness message indicates
the beginning of an interval of dl units of time (rule ReqLiveness1). During
this interval other entities (processes) will be asked if they are alive. To do
that, a PullDetector has an associated GS (group server) which records all the
other PullDetectors of the group. A PullDetector then asks the GS for these
PullDetectors (rule ReqLiveness2), receives references to them, and sends an
AYAlive message to each one (rule AreYouAlive). Responses to AYAlive messages,
via ImAlive messages (rule IamAlive), are accepted as long as the interval dl
has not expired. Upon reception of an ImAlive message during the valid interval,
a PullDetector entity records every entity that confirms its situation (rule
AliveReturn1). If the period for the response of monitored entities has
finished, the PullDetector entity does nothing for any response message received
after the interval (rule AliveReturn2). After the time interval for receiving
ImAlive responses finishes, a PullDetector starts a new monitoring cycle, in an
interval of dl units of time, with a Qliveness message (rule MntSuspList). At
the beginning of this pause the PullDetector entity builds a local list of
suspected entities, located in the slist attribute (rule MntSuspList), and sends
this list to every entity in the monitored group, using a Suspect message (rules
SuspBroadcast and SendSuspect). Besides, it sends a message EFFailed to every
member of the monitored group. Thus, every entity in the monitored group
receives (rule ReceiveSuspect) the local suspect list of the PullDetector that
originated the Suspect message, and computes the union of this list with its own
local suspect list, yielding a unified list (without repeated entries). The
reception of the message EFFailed causes the originator to be excluded from the
suspect list.
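
The handling of suspect lists at the end of a monitoring cycle can be sketched
in Python as below. The functions receive_suspect and receive_effailed are
hypothetical stand-ins for the effect of the rule ReceiveSuspect and of the
reception of EFFailed; plain Python lists replace the List abstract data type,
and the entity names are illustrative.

```python
def receive_suspect(own_suspects, received_suspects):
    """Union of the local and the received suspect lists, without repetitions."""
    unified = list(own_suspects)
    for entity in received_suspects:
        if entity not in unified:
            unified.append(entity)
    return unified


def receive_effailed(unified, originator):
    """The sender of EFFailed is evidently alive, so it is no longer suspected."""
    return [entity for entity in unified if entity != originator]


unified = receive_suspect(["pd2"], ["pd2", "pd3"])
print(unified)
print(receive_effailed(unified, "pd3"))
```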

4 Transforming Specifications to Represent Fault Models 

In this section we explain the methodology we have used to represent a subset 
of the classical fault models found in the literature using the OBGG formalism. 
After that, we show how to use (insert) these selected models in an already 
defined model of a desired system. 

Traditionally, terms like fault, error and failure are used as in [15]. Here we 
try to avoid the terms error and failure, and adopt a formal definition, given 
in [10], for fault. As stated in [3] a system may change its state based on two 
types of events: events of normal system operation and fault occurrences. Based 
on this observation, a fault can be modeled as an unwanted (but possible) state 
transition of a system [10]. Thus a fault behavior of a system is just another kind 
of a (programmable) behavior. These unwanted state transitions can be modeled 
through the use of additional virtual 1 variables, acting like guards to activate 
specific commands (guarded commands). In this case, a group of guarded com- 
mands represents a specific fault, i.e. the manner in which the system will exhibit 
the fault behavior, being activated whenever its associated guard is satisfied, by 
the assignment of a true value to it [9]. 

The addition of virtual variables and guarded commands can be viewed as
a transformation of a model M1 into a model M2 that contains in its state space
the behavior of a selected fault model [10]. Here, we adopt these concepts.
Since our specification formalism supports implicit parallelism and is
declarative, it is very suitable to represent guarded commands: the left-hand
side of a rule corresponds to the guard of the command; applying the rule
transformation (according to the right-hand side) corresponds to executing the
guarded command. Thus we can model fault behaviors for an OBGG specification by
inserting virtual variables (used for guards) and messages (used to activate a
fault behavior) in every entity of the model. Besides, we need both to create
rules representing the fault behavior introduced and, depending on the fault
model, to change the original rules defined for the entities that appear in the
model. Depending on the fault model, different rule transformations occur.
These transformations are explained in more detail in the next sections.



1 The term virtual is used to qualify variables that are not part of the desired system
itself, but part of the introduced fault behavior.

For the selection of fault models to be described using our formalism we have
adopted the fault classification found in [13]. There, fault models are
classified as crash, omission, timing, and Byzantine. Specifically, with respect
to the omission model, we have used the classification of [9], where the
omission model is split into send omission, receive omission, and general
omission. In the next sections we present the modeling of the crash, send
omission, receive omission, and general omission fault models. Also, we show how
to introduce a selected fault behavior in an existing system model. We have not
yet modeled the timing and Byzantine fault models. In the timing fault model a
process might respond too early or too late [13]. In the Byzantine fault model a
process may assume an arbitrary behavior, even a malicious one [ 4]. In the
future we intend to model these fault models as well. In this paper we
concentrated our efforts on the modeling of the previously cited fault models,
which are very often used in the fault tolerance literature (e.g. the crash
fault model is commonly considered for distributed systems).



4.1 Crash Fault Model 

In the crash fault model a process fails by halting. The processes that
communicate with the halted process are not warned about the fault. Below it is
shown how to transform a model M1 (without fault behavior) into a model M2 that
incorporates the behavior of a crash fault model.

To add the behavior of a crash fault model we perform the following
transformation on each entity GG = (T, I, Rules) (with type graph T, initial
graph I and set of rules Rules): insert a message Crash and an attribute down in
the type graph of the entity (depending on the value of this variable, the
entity may exhibit the fault behavior or not); insert the same message in the
initial graph of the entity, in order to activate the fault behavior; create a
new rule that activates the fault behavior of the entity; create new rules whose
left-hand sides are replicas of the original rules' left-hand sides and whose
right-hand sides specify no activity (leave the state unchanged), representing
the fault behavior once the guard is true (down is true); modify all the rules
of an entity with the insertion of a guard (down: false), meaning that the
entity will exhibit the original behavior only if it is not crashed. These
modifications generate a new entity GG' = (T', I', Rules') whose components
are illustrated by schemes in Figure 3: it shows how the type graph and initial
graph are transformed, how each rule r generates two rules r' and r'' in Rules',
and the rule EntCrash to be added.

An example of these rule transformations is presented in Figure 4, showing
the transformation of the rule IamAlive previously defined for the PullDetector
entity. Figure 4 (a) shows the normal behavior of the entity when the variable
down is set to false (the fault behavior is not active). Once the fault behavior
is activated, the rule defined in Figure 4 (b) will occur. This rule simply
consumes the message used to trigger the rule but does nothing (it neither
changes the state of the entity nor creates new messages); in this way we
incorporate the behavior of a crash fault model.
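
The crash transformation of this section can be approximated by the following
Python sketch. It is not the authors' implementation: rules are reduced to a
trigger name, a guard over the attributes and an effect, and the names Rule,
add_crash and enabled are hypothetical. The guard of EntCrash is assumed to be
always true, which the text does not state.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    name: str
    trigger: str                     # name of the message the rule consumes
    guard: Callable[[dict], bool]    # condition over the entity's attributes
    effect: Callable[[dict], tuple]  # attrs -> (attribute updates, new messages)


def add_crash(rules, attrs, initial_messages, crash_tmin):
    """Return (Rules', attributes', initial messages') for the crash fault model."""
    new_rules = []
    for r in rules:
        # r': the original behavior, guarded by down = false.
        new_rules.append(Rule(r.name, r.trigger,
                              lambda a, g=r.guard: not a["down"] and g(a),
                              r.effect))
        # r'': same trigger, enabled when down = true, leaves the state unchanged.
        new_rules.append(Rule(r.name + "_down", r.trigger,
                              lambda a: a["down"],
                              lambda a: ({}, [])))
    # EntCrash: consumes Crash and activates the fault behavior (assumed guard).
    new_rules.append(Rule("EntCrash", "Crash",
                          lambda a: True,
                          lambda a: ({"down": True}, [])))
    new_attrs = {**attrs, "down": False}
    new_initial = list(initial_messages) + [("Crash", crash_tmin)]
    return new_rules, new_attrs, new_initial


def enabled(rules, attrs, message):
    """Names of the rules that may consume `message` in the given state."""
    return [r.name for r in rules if r.trigger == message and r.guard(attrs)]


# Eleven placeholder rules stand in for the rules of an entity.
original = [Rule(f"r{i}", f"m{i}", lambda a: True, lambda a: ({}, []))
            for i in range(11)]
rules, attrs, initial = add_crash(original, {"slist": []}, [("Qliveness", 1)], 110)
print(len(rules), enabled(rules, attrs, "m0"))
```

Each original rule yields two rules and one activation rule is added, so an
entity with n rules ends up with 2n + 1 rules.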
Fig. 3. Transformation over a model to represent a crash fault model



4.2 Receive Omission Fault Model 

In the receive omission fault model a faulty process receives only a subset of
the messages sent to it [11]. The receive omission fault model is analogous to
the send omission fault model. The main difference is that in this fault model
only a subset of the total messages sent to the faulty process are received,
whereas in send omission only a subset of the total messages to be sent by the
process are actually sent.

Fig. 4. (a) Rule IamAlive without fault behavior (b) Rule IamAlive with fault behavior

To add the behavior of a receive omission fault model we perform the following
transformation on each entity GG = (T, I, Rules): insert an attribute in
the entity's type graph (depending on the value of the variable, the entity may
exhibit the fault behavior); insert a RcvOmit message in the entity's type
graph; create a new rule that activates the fault behavior of the entity; create
new rules whose left-hand sides are replicas of the original rules' left-hand
sides and whose right-hand sides specify no activity (leave the state
unchanged), representing the fault behavior once the guard is true (rcv_omitted
is true); leave the original rules of the entity unchanged (guards are not
inserted), since in the receive omission fault model a process may fail to
receive only a subset of the total set of messages sent to it. That is why we do
not insert a guard on the original rules: once the guard is true, the choice of
which rule to apply is made non-deterministically. These changes are illustrated
in Figure 5.
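
The receive omission transformation differs from the crash transformation in
that the original rules keep no guard, so that both the original rule and its
"do nothing" copy are enabled once the fault is active. A minimal Python sketch
follows; it is an illustration under assumptions, not the authors' tooling.
Rules are tuples (trigger, guard, effect), the functions add_receive_omission
and deliver are hypothetical, and the attribute name rcv_omitted is the reading
assumed here for the guard variable.

```python
import random


def add_receive_omission(rules):
    """rules: dict mapping rule name -> (trigger, guard, effect)."""
    new_rules = dict(rules)  # the original rules are kept unchanged
    for name, (trigger, _guard, _effect) in rules.items():
        # Copy with the same trigger that only consumes the message.
        new_rules[name + "_omit"] = (trigger,
                                     lambda a: a["rcv_omitted"],
                                     lambda a: {})
    # EntRcvOmit: consumes RcvOmit and activates the fault behavior.
    new_rules["EntRcvOmit"] = ("RcvOmit",
                               lambda a: True,
                               lambda a: {"rcv_omitted": True})
    return new_rules


def deliver(rules, attrs, message, rng=random):
    """Apply one enabled rule for `message`, chosen non-deterministically."""
    enabled = sorted(name for name, (trigger, guard, _) in rules.items()
                     if trigger == message and guard(attrs))
    chosen = rng.choice(enabled)
    attrs.update(rules[chosen][2](attrs))
    return chosen


# One illustrative rule that counts received Data messages.
base = {"Recv": ("Data", lambda a: True, lambda a: {"count": a["count"] + 1})}
rules = add_receive_omission(base)
attrs = {"count": 0, "rcv_omitted": False}
rng = random.Random(1)

print(deliver(rules, attrs, "Data", rng))     # only Recv is enabled so far
print(deliver(rules, attrs, "RcvOmit", rng))  # activates the fault behavior
outcomes = [deliver(rules, attrs, "Data", rng) for _ in range(200)]
print(outcomes.count("Recv"), outcomes.count("Recv_omit"))
```

After the fault is activated, some Data messages are still counted and others
are silently consumed, which is the "subset of the messages" behavior of the
model.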

4.3 Other Omission Fault Models 

Due to space restrictions we do not discuss the send omission and general omis- 
sion fault models at the same level of detail, but rather discuss them in terms 
of the ideas already presented. 

A process subject to a send omission fault fails to transmit a subset of the
total messages that it was supposed to send [11]. The transformation of a model
to incorporate the send omission behavior is analogous to the previous
transformations. However, in this case, some messages that should be sent will
simply be ignored. Again, in this model we use the non-deterministic choice of
rules offered by OBGG to model the fact that a message may be sent or not.

The general omission model states that a process may experience both send 
and receive omissions [18]. Using these concepts we model the general omission 
model as a merge of the previously discussed send and receive omission fault 
behaviors. 



5 Analysis of the Pull-Based Failure Detector 

In this section we exemplify the introduction of fault models described in the 
previous sections using the example of a pull-based failure detector defined in 
Section 3.1. We have applied the transformation for the crash fault model defined 
in Section 4.1, in order to reason about the system in presence of crash faults. 

First, we generated the entity PullDetectorCrash following the construction
given in Section 4.1. Applying the crash fault model transformation to the
PullDetector model presented in Section 3.1 generated 23 rules out of the
11 original ones. Of these 23 rules, 11 describe the behavior of the model
without crash, another 11 describe the behavior of a crashed pull detector,
and 1 serves to activate the crash model.

Fig. 5. Transformation over a model to represent a receive omission fault model

Second, we built the specification of a system containing three pull detectors
(three instances of the entity PullDetectorCrash), instantiating the tmin of
message Crash as MaxInt in two of them and as 110 in the other (see Figure 6).
This models the fact that pull detectors 1 and 3 will not crash, and pull
detector 2 may crash after 110 time units. When this Crash message is
received, the Pdetector2 entity will halt, assuming a behavior where messages
will be neither sent nor received, and its state will not change. Two of the
three Qliveness messages in the system, used for the activation of the entities,
have also been delayed, so that the entities do not start the detection cycle at
the same time. Note that the initial graph shown in this figure is just a part
of the initial graph of the system being modeled; the instantiations of the GS
entities are not shown. In the execution of this scenario both entities
Pdetector1 and Pdetector3 will detect that Pdetector2 has failed, and will put
the entity in their unified suspect list (olist variable). As presented in
Section 1, a simulation tool is currently our means to analyze the behavior of
OBGG formal specifications. We use this tool to generate a log of an execution
of the system.
Fig. 6. Initial graph for PullDetector entities considering a crash fault model



6 Final Remarks 

In this paper we have presented an approach for the specification and analysis
of distributed systems considering the presence of faults. We have shown the
definitions of crash and omission fault models, the latter being further
specialized into send, receive and general omission. Moreover, we have shown how
to combine the behavior of a given system model M1 and one of the fault models
defined, obtaining a model M2. Model M2 can be used to reason about the behavior
of the original model M1 in the presence of the considered fault.

The formal specification language we have used to represent models of systems
as well as to describe fault models is a restricted form of graph grammars,
called Object Based Graph Grammars. Because this formalism is declarative and
supports the notions of implicit parallelism and non-determinism, the
transformations from M1 to M2 were rather simple to define (see Section 4).
Since models described according to this formalism can be simulated, we have a
method and a supporting tool to help reason about distributed systems in the
presence of faults. Simulation can be used as a means for testing the behavior
of models in the presence of faults, as well as for performance analysis. Since
we also aim to prove properties over the models we define, we are analyzing an
approach for model checking models written in OBGG. This approach is based on
the mapping of OBGG specifications to a description language that could serve as
input to a model checker.